I. INTRODUCTION


Run-1 at the Large Hadron Collider (LHC) was a great success, culminating in the discovery of the Higgs boson.
No physics beyond the standard model was discovered in this run; however, Run-2, with a larger center-of-mass energy
and integrated luminosity, will allow for an increased discovery potential for new physics. Precision measurements
of the Higgs boson and of various electroweak observables will be performed with extraordinary accuracy in new
kinematic regimes in Run-2. Run-1 achievements, such as the combined ATLAS/CMS measurement of the Higgs
boson mass with 0.2% accuracy [?], will soon be superseded. For both precision measurements and for discovery of
possible new physics, it is important to have the proper tools for the calculation of the relevant cross sections. These
tools include both matrix element determinations at higher orders in perturbative QCD and electroweak theory,
and precision parton distribution functions (PDFs). The need for precision PDFs was driven home by the recent
calculation of the inclusive cross section for gluon-gluon fusion to a Higgs boson at NNNLO [?]. As this tour-de-force
calculation has significantly reduced the scale dependence of the Higgs cross section, the PDF and α_s uncertainties
become the dominant remaining theoretical uncertainty (as of the last PDF4LHC recommendation).

The CT10 parton distribution functions were published at next-to-leading order (NLO) in 2010 [?], followed by
the CT10 next-to-next-to-leading order (NNLO) parton distribution functions in 2013 [6]. These PDF ensembles
were determined using diverse experimental data from fixed-target experiments, HERA and the Tevatron collider,
but without data from the LHC. In this paper, we present a next generation of PDFs, designated as CT14. The
CT14 PDFs include data from the LHC for the first time, as well as updated data from the Tevatron and from
HERA experiments. Various CT14 PDF sets have been produced at the leading order (LO), NLO and NNLO and
are available from LHAPDF [?].

The CTEQ-TEA philosophy has always been to determine PDFs from data on inclusive, high-momentum transfer 
processes, for which perturbative QCD is expected to be reliable. For example, in the case of deep inelastic lepton 
scattering, we only use data with Q > 2 GeV and W > 3.5 GeV. Data in this region are expected to be relatively 
free of non-perturbative effects, such as higher twists or nuclear corrections. Thus, there is no need to introduce 
phenomenological models for nonperturbative corrections beyond the leading-twist perturbative contributions. 

For the majority of processes in the CT14 global analysis, theoretical predictions are now included at the NNLO level
of accuracy. In particular, an NNLO treatment [?] of heavy-quark mass effects in neutral-current DIS is realized in the
S-ACOT-χ scheme [?] and is essential for obtaining correct predictions for LHC electroweak cross sections [?].
We make two exceptions to this rule, by including measurements for charged-current DIS and inclusive jet production
at NLO only. In both cases, the complete NNLO contributions are not yet available, but it can be argued based on
our studies that the expected effect of the missing NNLO contributions is small relative to current experimental
errors (cf. Sec. II). For both types of processes, the NLO predictions have undergone various benchmarking tests. A
numerical error was discovered and corrected in the implementation of the S-ACOT-χ scheme for charged-current DIS,
resulting in relatively small changes from CT10 (within the PDF uncertainties).

As in the CT10 global analysis, we use a charm pole mass of 1.3 GeV, which was shown to be consistent with the
CT10 data in Ref. [6]. The PDFs for u, d, s (anti-)quarks and the gluon are parametrized at an initial scale of 1.295
GeV, and the charm quark PDF is turned on with zero intrinsic charm as the scale Q reaches the charm pole mass.

The new LHC measurements of W/Z cross sections directly probe the flavor separation of u and d (anti-)quarks in
an x-range around 0.01 that was not directly assessed by the previously available experiments. We also include an
updated measurement of electron charge asymmetry from the D0 collaboration [?], which probes the d quark PDF
at x > 0.1. To better estimate variations in relevant PDF combinations, such as d(x, Q)/u(x, Q) and d̄(x, Q)/ū(x, Q),
we increased the number of free PDF parameters to 28, compared to 25 in CT10 NNLO. As another important
modification, CT14 employs a novel flexible parametrization for the PDFs, based on the use of Bernstein polynomials
(reviewed in the Appendix). The shape of the Bernstein polynomials is such that a single polynomial is dominant in
each given x range, reducing undesirable correlations among the PDF parameters that sometimes occurred in CT10.
In the asymptotic limits of x → 0 or x → 1, the new parametrization forms allow for the possibility of arbitrary
constant ratios of d/u or d̄/ū, in contrast to the more constrained behavior assumed in CT10.
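The localization property of Bernstein polynomials mentioned above can be illustrated with a short sketch. It uses the textbook basis b_{k,n}(x) = C(n,k) x^k (1 - x)^(n-k) on [0, 1]; the degree n = 4 and the function names are assumptions chosen for illustration, and the actual polynomial degree and argument used in CT14 are those given in the Appendix, not reproduced here.

```python
from math import comb


def bernstein(k, n, x):
    """Textbook Bernstein basis polynomial b_{k,n}(x) on the interval [0, 1]."""
    return comb(n, k) * x**k * (1.0 - x) ** (n - k)


n = 4  # hypothetical degree, for illustration only
grid = [i / 1000 for i in range(1001)]

# Location of the maximum of each basis polynomial on the grid.
peaks = []
for k in range(n + 1):
    values = [bernstein(k, n, x) for x in grid]
    peaks.append(grid[values.index(max(values))])


def dominant_index(n, x):
    """Index k of the basis polynomial that is largest at the point x."""
    return max(range(n + 1), key=lambda k: bernstein(k, n, x))


print("peak positions:", peaks)
print("dominant polynomial at x = 0.3:", dominant_index(n, 0.3))
```

Each basis polynomial peaks at x = k/n, so varying one coefficient mostly changes the function near that point. This is the sense in which a single polynomial dominates in each x range.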

The PDF error sets of the CT14 ensemble are obtained using two techniques, the Hessian method [?] and Monte-Carlo
sampling [?]. Lagrange multiplier studies [17] have also been used to verify the Hessian uncertainties, especially
in regions not well constrained by data. This applies at NNLO and NLO; no error sets are provided at LO due to the
difficulty in defining meaningful uncertainties at that order.

A central value of α_s(M_Z) of 0.118 has been assumed in the global fits at NLO and NNLO, but PDF sets at
alternative values of α_s(M_Z) are also provided. CT14 prefers α_s(M_Z) = 0.115^{+0.006}_{-0.004} at NNLO
(0.117 ± 0.005 at NLO) at 90% confidence level (C.L.). These uncertainties from the global QCD fits are larger than
those of the data from LEP and other experiments included into the world average [19]. Thus, the central PDF sets
are obtained using the value of 0.118, which is consistent with the world average value and was recommended by the
PDF4LHC group [?]. For the CT14 LO PDFs, we follow the precedent begun in CTEQ6 [21] by supplying two versions,
one with a 1-loop α_s(M_Z) value of 0.130, and the other with a 2-loop α_s(M_Z) value of 0.118.

The flavor composition of CT14 PDFs has changed somewhat compared to CT10 due to the inclusion of new
LHC and Tevatron data sets, to the use of modified parametrization forms, and to the numerical modifications
discussed above. The new PDFs are largely compatible with CT10 within the estimated PDF uncertainty. The CT14
NNLO PDFs have a softer strange quark distribution at low x and a somewhat softer gluon at high x, compared
to CT10 NNLO. The d/u ratio has decreased at high x in comparison to CT10, as a consequence of replacing the
2008 D0 electron charge asymmetry (0.75 fb^-1 [22]) measurement by the new 9.7 fb^-1 data set [?]. The d/u ratio
approaches a constant value in the x → 1 limit due to the input physics assumption that both d_val and u_val behave
as (1 − x)^{a_2} at x → 1 with the same value of a_2 (reflecting expectations from spectator counting rules), but
allowing for independent normalizations. The d̄/ū ratio has also changed as a consequence of the new data and the new
parametrization form.

The organization of the paper is as follows. In Sec. II we list the data sets used in the CT14 fit and discuss further
aspects of the global fits for the central CT14 PDFs and for the error sets. In Sec. III we show various aspects of the
resultant CT14 PDFs and make comparisons to CT10 PDFs. In Sec. IV we show comparisons of NNLO predictions
using the CT14 PDFs to some of the data sets used in the global fits. Specifically, we compare to experimental
measurements of jet, W and Z, and W + c cross sections. In Sec. V we discuss NNLO predictions using the CT14 PDFs
for Higgs boson production via the gluon-gluon fusion channel and for top quark and anti-quark pair production. Our
conclusion is given in Sec. VI.
II. SETUP OF THE ANALYSIS

A. Overview of the global fit 

The goal of the CT14 global analysis is to provide a new generation of PDFs intended for widespread use in high- 
energy experiments. As we generate new PDF sets, we include newly available experimental data sets and theoretical 
calculations, and redesign the functional forms of PDFs if new data or new theoretical calculations favor it. All 
changes — data, theory, and parametrization — contribute to the differences between the old and new generations of 
PDFs in ways that are correlated and frequently cannot be separated. The most important, but not the only, criterion
for the selection of PDFs is the minimization of the log-likelihood χ² that quantifies agreement of theory and data.
In addition, we make some "prior assumptions" about the forms of the PDFs. A PDF set that violates them may be
rejected even if it lowers χ². For example, we assume that the PDFs are smoothly varying functions of x, without
abrupt variations or short-wavelength oscillations. This is consistent with the experimental data and sufficient for
making new predictions. No PDF can be negative at the input scale Q_0, to preclude negative cross sections in the
predictions. Flavor-dependent ratios or cross section asymmetries must also take physical values, which limits the
range of allowed parametrizations in extreme kinematical regions with poor experimental constraints. For example,
in the CT14 parametrization we restricted the functional forms of the u and d PDFs so that d(x, Q_0)/u(x, Q_0) would
remain finite and nonzero at x → 1, cf. the Appendix. We now review every input of the CT14 PDF analysis in turn,
starting with the selection of the new experiments. 


B. Selection of experiments 


The experimental data sets that are included in the CT14 global analysis are listed in Tables I (lepton scattering) and
II (production of inclusive lepton pairs and jets). There are a total of 2947 data points included from 33 experiments,
producing a χ² value of 3252 for the best fit (with χ²/N_pt = 1.10). It can be seen from the values of χ² in Tables I and
II that the data and theory are in reasonable agreement for most experiments. The variable S_n in the last column
is an "effective Gaussian variable", first introduced in Sec. 5 of Ref. [?] and defined for the current analysis in
Refs. [?, 23]. The effective Gaussian variable quantifies compatibility of any given data set with a particular PDF
fit in a way that is independent of the number of points N_pt,n in the data set. It maps the χ²_n values of individual
experiments, whose probability distributions depend on N_pt,n in each experiment (and thus are not identical), onto
S_n values that obey a cumulative probability distribution shared by all experiments, independently of N_pt,n. Values
of S_n between −1 and +1 correspond to a good fit to the n-th experiment (at the 68% C.L.). Large positive values
(> 2) correspond to a poor fit, while large negative values (< −2) correspond to experiments that are fit unusually well.
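The idea of mapping χ²_n onto a variable with a common, approximately standard normal distribution can be sketched with Fisher's classic approximation, S ≈ sqrt(2 χ²) − sqrt(2 N − 1). This choice is an assumption made for illustration: the precise definition of S_n used in the analysis is the one given in the cited references, and it need not coincide with Fisher's form, so the numbers below only roughly track the tabulated S_n values. The function name is hypothetical.

```python
from math import sqrt


def effective_gaussian_fisher(chi2, npt):
    """Fisher's approximation: sqrt(2*chi2) - sqrt(2*npt - 1) is roughly N(0, 1).

    Illustrative stand-in for the effective Gaussian variable S_n; the
    definition actually used in the analysis is given in the cited references.
    """
    return sqrt(2.0 * chi2) - sqrt(2.0 * npt - 1.0)


# (chi2_n, N_pt,n) pairs quoted in the text and in Table I.
examples = {
    "global fit": (3252.0, 2947),
    "ID 101": (384.0, 337),
    "ID 108": (72.0, 85),
}

for label, (chi2, npt) in examples.items():
    s = effective_gaussian_fisher(chi2, npt)
    print(f"{label}: chi2/N = {chi2 / npt:.2f}, S (Fisher approx.) = {s:.2f}")
```

The point of such a mapping is that a value like S = 2 has the same statistical meaning for an experiment with 10 points as for one with 500 points, which is not true of χ²/N_pt.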

The goodness-of-fit for CT14 NNLO is comparable to that of our earlier PDFs, but the more flexible parametrizations
did result in improved agreement with some data sets. For example, by adding additional parameters to the
{u, ū} and {d, d̄} parton distributions, somewhat better agreement was obtained for the BCDMS and NMC data at
low values of Q. The quality of the fit can also be evaluated based on the distribution of S_n values, which follows a
standard normal distribution (of width 1) in an ideal fit. As in the previous fits, the actual S_n distribution (cf. the
solid curve in Fig. 1) is somewhat wider than the standard normal one (the dashed curve), indicating the presence
of disagreements, or tensions, between some of the included experiments. The tensions have been examined before
and originate largely from experimental issues, almost independent of the perturbative QCD order or PDF
parametrization form. A more detailed discussion of the level of agreement between data and theory will be provided
in Sec. IV.

ID#   Experimental data set              Ref.   N_pt,n   χ²_n   χ²_n/N_pt,n   S_n
101   BCDMS F_2^p                        [?]    337      384    1.14          1.74
102   BCDMS F_2^d                        [?]    250      294    1.18          1.89
104   NMC F_2^d/F_2^p                    [?]    123      133    1.08          0.68
106   NMC σ_red^p                        [?]    201      372    1.85          6.89
108   CDHSW F_2^p                        [27]   85       72     0.85          -0.99
109   CDHSW F_3^p                        [27]   96       80     0.83          -1.18
110   CCFR F_2^p                         [28]   69       70     1.02          0.15
111   CCFR xF_3^p                        [29]   86       31     0.36          -5.73
124   NuTeV νμμ SIDIS                    [30]   38       24     0.62          -1.83
125   NuTeV ν̄μμ SIDIS                    [30]   33       39     1.18          0.78
126   CCFR νμμ SIDIS                     [?]    40       29     0.72          -1.32
127   CCFR ν̄μμ SIDIS                     [?]    38       20     0.53          -2.46
145   H1 σ_r^b                           [32]   10       6.8    0.68          -0.67
147   Combined HERA charm production     [?]    47       59     1.26          1.22
159   HERA1 Combined NC and CC DIS       [?]    579      591    1.02          0.37
169   H1 F_L                             [?]    9        17     1.92          1.7

TABLE I: Experimental data sets employed in the CT14 analysis. These are the lepton deep-inelastic scattering experiments.
N_pt,n and χ²_n are the number of points and the value of χ² for the n-th experiment at the global minimum. S_n is the
effective Gaussian parameter [?] quantifying agreement with each experiment.



FIG. 1: Best-fit S_n values of 33 experiments in the CT14 analysis.


ID#   Experimental data set                              Ref.   N_pt,n   χ²_n   χ²_n/N_pt,n   S_n
201   E605 Drell-Yan process                             [37]   119      116    0.98          -0.15
203   E866 Drell-Yan process, σ_pd/(2σ_pp)               [38]   15       13     0.87          -0.25
204   E866 Drell-Yan process, Q³ d²σ_pp/(dQ dx_F)        [39]   184      252    1.37          3.19
225   CDF Run-1 electron A_ch, p_Tℓ > 25 GeV             [40]   11       8.9    0.81          -0.32
227   CDF Run-2 electron A_ch, p_Tℓ > 25 GeV             [?]    11       14     1.24          0.67
234   D0 Run-2 muon A_ch, p_Tℓ > 20 GeV                  [42]   9        8.3    0.92          -0.02
240   LHCb 7 TeV 35 pb^-1 W/Z dσ/dy_ℓ                    [43]   14       9.9    0.71          -0.73
241   LHCb 7 TeV 35 pb^-1 A_ch, p_Tℓ > 20 GeV            [43]   5        5.3    1.06          0.30
260   D0 Run-2 Z rapidity                                [?]    28       17     0.59          -1.71
261   CDF Run-2 Z rapidity                               [?]    29       48     1.64          2.13
266   CMS 7 TeV 4.7 fb^-1, muon A_ch, p_Tℓ > 35 GeV      [?]    11       12.1   1.10          0.37
267   CMS 7 TeV 840 pb^-1, electron A_ch, p_Tℓ > 35 GeV  [?]    11       10.1   0.92          -0.06
268   ATLAS 7 TeV 35 pb^-1 W/Z cross sec., A_ch          [?]    41       51     1.25          1.11
281   D0 Run-2 9.7 fb^-1 electron A_ch, p_Tℓ > 25 GeV    [?]    13       35     2.67          3.11
504   CDF Run-2 inclusive jet production                 [49]   72       105    1.45          2.45
514   D0 Run-2 inclusive jet production                  [50]   110      120    1.09          0.67
535   ATLAS 7 TeV 35 pb^-1 incl. jet production          [51]   90       50     0.55          -3.59
538   CMS 7 TeV 5 fb^-1 incl. jet production             [52]   133      177    1.33          2.51

TABLE II: Same as Table I, showing experimental data sets on Drell-Yan processes and inclusive jet production.


1. Experimental data from the LHC 


Many of these data sets have also been used in previous CT analyses, such as the one that produced the CT10 NNLO
PDFs. As mentioned, no LHC data were used in the CT10 fits. Nonetheless, the CT10 PDFs have been in good
agreement with LHC measurements so far.

As the quantity of the LHC data has increased, the time has come to include the most germane LHC measurements 
into CT fits. The LHC has measured a variety of standard model cross sections, yet not all of them are suitable for 
determination of PDFs according to the CT method. For that, we need to select measurements that are experimentally 
and theoretically clean and are compatible with the global set of non-LHC hadronic experiments. 

In the CT14 study, we select a few such LHC data sets at √s = 7 TeV, focusing on the measurements that provide
novel information to complement the non-LHC data. From vector boson production processes, we selected W/Z cross
sections and the charged lepton asymmetry measurement from ATLAS [?], the charged lepton asymmetry in the
electron and muon decay channels [?] from CMS, and the W/Z lepton rapidity distributions and charged lepton
asymmetry from LHCb [43]. The ATLAS and CMS measurements primarily impose constraints on the light quark
and antiquark PDFs at x > 0.01. The LHCb data sets, while statistically limited, impose minor constraints on u and
d PDFs at x = 0.05 - 0.1.


Upon including these measurements, we can relax the parametric constraints on the sea (anti-)quark PDFs of u,
ū, d, and d̄. In the absence of relevant experimental constraints in the pre-CT14 fits, the PDF parametrizations were
chosen so as to enforce ū/d̄ → 1, u/d → 1 at x → 0 in order to obtain convergent fits. As reviewed in the Appendix,
the CT14 parametrization form is more flexible, in the sense that only the asymptotic power x^{a_1} is required to be the
same in all light-quark PDFs in the x → 0 limit. This choice produces wider uncertainty bands on u_v, d_v, and ū/d̄ at
x → 0, with the spread constrained by the newly included LHC data.

From the other LHC measurements, we now include single-inclusive jet production at ATLAS [51] and CMS [52].
These data sets provide complementary information to Tevatron inclusive jet production cross sections from CDF
Run-2 and D0 Run-2 [49, 50] that are also included. The purpose of jet production cross sections is primarily to
constrain the gluon PDF g(x, Q). While the uncertainties from the LHC jet cross sections are still quite large, they
probe the gluon PDF across a much wider range of x than the Tevatron jet cross sections.

One way to gauge the sensitivity of a specific data point to some PDF f(x, Q) at a given x and Q is to compute
a correlation cosine between the theoretical prediction for this point and f(x, Q) [56]. In the case of CT10
NNLO, the sensitivity of the LHC charge asymmetry data sets to the valence PDF combinations at x = 0.01 - 0.1
was established by this method in Sec. 7C of [6]. However, the somewhat large strength of correlations at small x
that had been observed suggested the possibility that CT10 light-quark parametrizations were not sufficiently flexible
in the x region probed by the LHC charge asymmetry.
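For readers unfamiliar with the correlation cosine, the following minimal sketch shows how it is commonly evaluated from Hessian eigenvector sets, using the standard symmetric formulas ΔX = (1/2) sqrt(Σ_i (X_i^+ − X_i^-)²) and cos φ = Σ_i (X_i^+ − X_i^-)(Y_i^+ − Y_i^-) / (4 ΔX ΔY). The function names and the three-eigenvector toy inputs are hypothetical; in practice X would be a cross section prediction and Y the PDF value f(x, Q), each evaluated with the plus and minus error set of every eigenvector direction.

```python
from math import sqrt


def hessian_uncertainty(plus, minus):
    """Symmetric Hessian uncertainty from values on the +/- eigenvector sets."""
    return 0.5 * sqrt(sum((p - m) ** 2 for p, m in zip(plus, minus)))


def correlation_cosine(x_plus, x_minus, y_plus, y_minus):
    """Correlation cosine between two quantities X and Y over the same error sets."""
    dx = hessian_uncertainty(x_plus, x_minus)
    dy = hessian_uncertainty(y_plus, y_minus)
    overlap = sum(
        (xp - xm) * (yp - ym)
        for xp, xm, yp, ym in zip(x_plus, x_minus, y_plus, y_minus)
    )
    return overlap / (4.0 * dx * dy)


# Toy inputs with three eigenvector directions (made-up numbers).
sigma_plus, sigma_minus = [1.10, 1.00, 1.20], [0.90, 1.00, 0.80]
pdf_plus, pdf_minus = [0.52, 0.50, 0.54], [0.48, 0.50, 0.46]      # moves with sigma
other_plus, other_minus = [0.30, 0.35, 0.30], [0.30, 0.25, 0.30]  # moves independently

print(correlation_cosine(sigma_plus, sigma_minus, pdf_plus, pdf_minus))
print(correlation_cosine(sigma_plus, sigma_minus, other_plus, other_minus))
```

A value of cos φ near ±1 means that the same eigenvector directions drive both uncertainties, while a value near 0 means that the PDF uncertainty at that x contributes little to the uncertainty of the prediction.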

Since CT14 has adopted more flexible parametrizations for the affected quark flavors, the above correlations with
u_v, d_v, and d/u at small x are now somewhat relaxed, as illustrated by the newly computed correlations between
CT14 NNLO and CMS A_ch data in Fig. 2. Each line shows cos φ between f(x, Q) and the NNLO prediction for one
of the bins of the data. When the PDF uncertainty receives a large contribution from f(x, Q), cos φ comes out to be
close to ±1, say, |cos φ| > 0.7. With the new parametrization form, the CMS charge asymmetry is reasonably, but
not exceptionally, correlated with both d̄/ū and d/u at x ≈ 0.01, corresponding to central-rapidity production of weak
bosons at √s = 7 TeV (indicated by a vertical dashed line in the figure). The correlation with u_v and d_v is smaller
than in CT10.


For the ATLAS [51], CMS [52], CDF and D0 [49, 50] inclusive jet data sets, the correlation cosine, cos φ, for the gluon
PDF is plotted in Fig. 3 using NLO QCD theory to evaluate the theoretical cross section. Again, the lines correspond
to individual p_Tj bins of the data. We observe that the CDF and D0 jet cross sections are highly correlated with
the gluon PDF g(x, Q) at x > 0.05, and anticorrelated at small x as a consequence of the momentum sum rule. The
ATLAS and CMS jet cross sections are highly correlated with g(x, Q) in a much wider range, x > 0.005. In contrast,
the PDF-induced correlation of the jet cross sections with the quark PDFs, such as u(x, Q) in Fig. 4, is at most
moderate. The ATLAS and CMS jet data therefore have the potential to reduce the gluon uncertainty, but significant
reduction will require the data from Run 2.


2. High-luminosity lepton charge asymmetry from the Tevatron 


Forward-backward asymmetry (A_ch) distributions of charged leptons from inclusive weak boson production at the
Tevatron are uniquely sensitive to the average slope of the ratio d(x, Q)/u(x, Q) at large x, of order 0.1 and above.
In the CT14 analysis, we include several data sets of A_ch measured at √s = 1.8 and 1.96 TeV by the CDF and
D0 Collaborations. The CDF Run-1 data set on A_ch [40], which was instrumental in resolving conflicting
information on the large-x behavior of u(x, Q) and d(x, Q) from contemporary fixed-target DIS experiments [61-64],
is supplemented by the CDF Run-2 data set at 170 pb^-1 [?]. A_ch data at √s = 1.96 TeV from D0 in the electron [?]
and muon [42] decay channels, for 9.7 fb^-1 and 0.3 fb^-1, are also included. In all A_ch data sets, we include
subsamples with the cuts on the transverse momentum p_Tℓ of the final-state lepton specified in Table II.

FIG. 2: The correlation cosine cos φ [?] between the PDF f(x, Q = 100 GeV) at the specified x value on the horizontal
axis and NNLO predictions for the CMS muon charge asymmetry [?]. Panel title: "Correlation between CMS W asym. and
CT14 PDFs at 100 GeV".

FIG. 3: The correlation cosine cos φ [?] between the g-PDF at the specified x value on the horizontal axis and NLO predictions
for the CDF (upper left panel), D0 (upper right panel), ATLAS (lower left panel) and CMS (lower right
panel) inclusive jet cross sections at Q = 100 GeV.

FIG. 4: The correlation cosine cos φ [?] between the u-PDF at the specified x value on the horizontal axis and NLO predictions
for the CDF (upper left panel), D0 (upper right panel), ATLAS (lower left panel) and CMS (lower right
panel) inclusive jet cross sections at Q = 100 GeV. Panel titles: CT14 NNLO u(x, 100 GeV) with CDF inclusive jet
R=0.7, |y| < 2.1; D0 inclusive jet R=0.7, |y| < 2.4; ATLAS single inc. jet R=0.6, |y| < 4.4; CMS single inc. jet
R=0.7, |y| < 2.5.

The electron data set (9.7 fb^-1) from D0 that we now include replaces the 0.75 fb^-1 counterpart set [22], first
included in CT10. This replacement has an important impact on the determination of the large-x quark PDFs; thus,
these new A_ch data sets are perhaps the most challenging and valuable among all that were added in CT14.

The D0 A_ch data have small experimental errors, and hence push the limits of the available theoretical calculations.
Relatively small differences in the average slope (with respect to x) of the d/u ratio in the probed region can produce
large variations in χ² for the Tevatron charge asymmetry [61-63]. By varying the minimal selection cuts on p_Tℓ of
the lepton, it is possible to probe subtle features of the large-x PDFs. For that, understanding of the transverse
momentum dependence in both experiment and theory is necessary, which demands evaluation of transverse momentum
resummation effects.

When the first Tevatron Run-2 A_ch data sets were implemented in CT fits, significant tensions were discovered
between the electron and muon channels, and even between different p_Tℓ bins within one decay channel. The tensions
prompted a detailed study in the CT10 analysis [?]. The study found that various p_Tℓ bins of the electron and muon
asymmetries from D0 disagree with DIS experiments and among themselves.

In light of these unresolved tensions, we published a CT10 PDF ensemble at NLO, which did not include the
D0 Run-2 A_ch data and yielded a d/u ratio that was close to that ratio in CTEQ6.6 NLO. An alternative CT10W
NLO ensemble was also constructed. It included four p_Tℓ bins of that data and predicted a harder d/u behavior at
x → 1. When constructing the counterpart CT10 NNLO PDFs in Ref. [6], we took an in-between path and included only
the two most inclusive p_Tℓ bins, one from the electron [22] and one from the muon [42] samples. This choice still
resulted in a larger d/u asymptotic value in CT10 NNLO than in CTEQ6.6.

The new A_ch data for 9.7 fb^-1 in the electron channel are more compatible with the other data in the global fit that
we included. Therefore, CT14 includes the D0 A_ch measurement in the muon channel with p_Tℓ > 20 GeV [42] and
in the electron channel with p_Tℓ > 25 GeV [?]. The replacement does not affect the general behavior of the PDFs,
except that the CT14 d/u ratio at high x follows the trends of CTEQ6.6 NLO and CT10 NLO, rather than of CT10W
NLO and CT10 NNLO.


3. New HERA data 


CT14 includes a combined HERA-1 data set of reduced cross sections for semi-inclusive DIS production of open
charm [?], and measurements of the longitudinal structure function F_L(x, Q) in neutral-current DIS [?]. The former
replaces independent data sets of charm structure functions and reduced cross sections from H1 and ZEUS [65-68].
Using the combined HERA charm data set, we obtain a slightly smaller uncertainty on the gluon at x < 0.01 and better
constraints on charm mass than with independent sets [69]. The latter HERA data set, on F_L, is not independent
from the combined HERA set on inclusive DIS [?], but has only nine data points and does not significantly change
the global χ². Its utility is primarily to prevent unphysical solutions for the gluon PDF at small x at the stage of the
PDF error analysis.
4. Other LHC results


One class of LHC data that could potentially play a large role [?] in the determination of the gluon distribution,
especially at high x, is the differential distributions of tt̄ production, now available from ATLAS [59] and CMS [58].
However, these data are not included into our fit, as the differential NNLO tt̄ cross section predictions for the LHC
are not yet complete and the total cross section measurements lack statistical power [?]. In addition, constraints on
the PDFs from tt̄ cross sections are mutually correlated with the values of QCD coupling and top quark mass. NLO
electroweak corrections, playing an important role [?] for these data, are still unavailable for some tt̄ kinematic
distributions. Once these calculations are completed, they will be incorporated in future versions of CT PDFs. For
now, we simply show predictions from CT14 for the tt̄ distributions using the approximate NNLO calculations in
Section V.


C. Summary of theoretical calculations 


1. QCD cross sections 


The CT14 global analysis prioritizes the selection of published data for which NNLO predictions are available,
and theoretical uncertainties of various kinds are well understood. Theoretical calculations for neutral-current DIS
are based on the NNLO implementation [?] of the S-ACOT-χ factorization scheme [?] with massive quarks. For
inclusive distributions in the low-mass Drell-Yan process, NNLO predictions are obtained with the program VRAP [75].
Predictions for W/Z production and weak boson charge asymmetries with p_Tℓ cuts are obtained with the
NNLL-(approx. NNLO) program ResBos [77-80], as in the previous analyses.

As already mentioned in the introduction, two exceptions from this general rule concern charged-current DIS and 
collider jet production. Both have unique sensitivities to crucial PDF combinations, but are still known only to NLO. 
The CCFR and NuTeV data on inclusive and semi-inclusive charged-current DIS are indispensable for constraining the
strangeness PDF; single-inclusive jet production at the Tevatron and now at the LHC are essential for constraining 
the gluon distribution. Yet, in both categories, the experimental uncertainties are fairly large and arguably diminish 
the impact of missing NNLO effects. Given the importance of these measurements, our approach is then to include 
these data in our NNLO global PDF fits, but evaluate their matrix elements at NLO. 

According to this choice, we do not rely on the use of threshold resummation techniques [81, 82] to approximate
the NNLO corrections in jet production. Nor do we remove the LHC jet data due to the kinematic limitations of such
resummation techniques [83]. A large effort was invested in the CT10 and CT14 analyses to estimate the possibility
of biases in the NNLO PDFs due to using NLO cross sections for jet production [92, 93]. The sensitivity of the central
PDFs and their uncertainty to plausible NNLO corrections was estimated with a variation of Cacciari-Houdeau's
method [?], by introducing additional correlated systematic errors in jet production associated with the residual
dependence on QCD scales and a potential missing contribution of a typical magnitude expected from an NNLO
correction. These exercises produced two conclusions. First, the scale variation in the NLO jet cross section is
reduced if the central renormalization and factorization scales are set equal to the transverse momentum p_T of the
individual jet in the data bin. This choice is adopted both for the LHC and Tevatron jet cross sections. In the
recently completed partial NNLO calculation for jets produced via gg scattering [85], this scale choice leads to an
NNLO/NLO K-factor that is both smaller than for the alternative scale equal to the leading jet's p_T, and is relatively
constant over the range of the LHC jet measurements [87]. Second, the plausible effect of the residual QCD scale
dependence at NLO can be estimated as a correlated uncertainty in the CT10 NNLO fit. Currently it has a marginal
effect on the central PDF fits and the PDF uncertainty.

The CT14 analysis computes NLO cross sections for inclusive jet production with the help of FastNLO [88] and
ApplGrid [89] interfaces to NLOJET++ [90, 91]. A series of benchmarking exercises that we had completed [92]
verified that the fast interfaces are in good agreement among themselves and with an independent NLO calculation
in the program MEKS [92]. Both ATLAS and CMS have measured the inclusive jet cross sections for two jet sizes.
We use the larger of the two sizes (0.6 for ATLAS and 0.7 for CMS) to further reduce the importance of NNLO
corrections.


2. Figure-of-merit function 


In accord with the general procedure summarized in Ref. [6], the most probable solutions for CT14 PDFs are found
by a minimization of the function

$$ \chi^2_{\rm global} = \sum_{n=1}^{N_{\rm exp}} \chi^2_n + \chi^2_{\rm th}. \qquad (1) $$

This function sums contributions χ²_n from N_exp fitted experiments and includes a contribution χ²_th specifying
theoretical conditions ("Lagrange Multiplier constraints") imposed on some PDF parameters. In turn, the χ²_n are
constructed as in Eq. (14) of [6] and account for both uncorrelated and correlated experimental errors. Section 3 of
that paper includes a detailed review of the statistical procedure that we continue to follow. Instead of repeating that
review, we shall briefly remind the reader about the usage of the tolerance and quasi-Gaussian S variables when
constructing the error PDFs.

The minimum of the $\chi^2_{\rm global}$ function is found iteratively by the method of steepest descent using the
program MINUIT. The boundaries of the 90% C.L. region around the minimum of $\chi^2_{\rm global}$, and the eigenvector
PDF sets quantifying the associated uncertainty, are found by iterative diagonalization of the Hessian matrix [?].
The 90% C.L. boundary in the CT14 and CT10 analyses is determined according to two tiers of criteria, based on the
increase in the global $\chi^2_{\rm global}$ summed over all experiments, and on the agreement with individual
experimental data sets [?, ?, 12]. The first type of condition demands that the global $\chi^2$ does not increase above
the best-fit value by more than $\Delta\chi^2_{\rm global} = T^2$, where the 90% C.L. region corresponds to
$T \approx 10$. The second condition introduces a penalty term $P$, called the Tier-2 penalty, in $\chi^2$ when
establishing the confidence region, which quickly grows when the fit ceases to agree with any specific experiment
within the 90% C.L. for that experiment. The effective function $\chi^2_{\rm eff} = \chi^2_{\rm global} + P$ is
scanned along each eigenvector direction until $\chi^2_{\rm eff}$ increases above the tolerance bound, or rapid
$\chi^2_{\rm eff}$ growth due to the penalty $P$ is triggered.

The penalty term is constructed as

$$P = \sum_{n=1}^{N_{\rm exp}} (S_n)^k\, \theta(S_n) \qquad (2)$$

from the equivalent Gaussian variables $S_n$ that obey an approximate standard normal distribution independently of
the number of data points $N_{pt,n}$ in the experiment. Every $S_n$ is a monotonically increasing function of the
respective $\chi^2_n$ given in [18, ?]. The power $k = 16$ is chosen so that $(S_n)^k$ sharply increases from zero when
$S_n$ approaches 1.3, the value corresponding to the 90% C.L. cutoff. The implementation of $S_n$ is fully documented
in the appendix of Ref. [?].
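The two-tier criterion can be illustrated with a short numerical sketch. It assumes that the equivalent Gaussian variables $S_n$ have already been computed for each experiment; the function names, the default tolerance $T = 10$, and the simplified stopping test are illustrative assumptions and are not taken from the CT14 fitting code.

```python
from typing import Sequence


def tier2_penalty(s_values: Sequence[float], k: int = 16) -> float:
    """Tier-2 penalty P = sum_n (S_n)^k * theta(S_n).

    Only experiments with S_n > 0 (a worse-than-expected fit) contribute.
    """
    return sum(s ** k for s in s_values if s > 0.0)


def chi2_effective(chi2_global: float, s_values: Sequence[float], k: int = 16) -> float:
    """Effective figure of merit chi2_eff = chi2_global + P."""
    return chi2_global + tier2_penalty(s_values, k)


def inside_tolerance(chi2_eff: float, chi2_best: float, tolerance: float = 10.0) -> bool:
    """Simplified test: chi2_eff may exceed the best-fit value by at most T^2."""
    return chi2_eff - chi2_best <= tolerance ** 2


# Hypothetical S_n values: the penalty is tiny below S_n ~ 1 and grows steeply near 1.3.
for s in (0.5, 1.0, 1.3, 1.5):
    print(f"S_n = {s:.1f}  ->  (S_n)^16 = {tier2_penalty([s]):.3g}")
```

With $k = 16$, a single experiment at $S_n = 1.3$ already contributes a penalty of the same order as the global tolerance $T^2 = 100$, so the scan along an eigenvector direction stops soon after any one data set reaches its own 90% C.L. limit.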


3. Correlated systematic errors 


In many of the data sets included in the CT14 analysis, the reported correlated systematic errors from experimental 
sources dominate over the statistical errors. Care must therefore be taken in the treatment of these systematic errors 
to avoid artificial biases in the best-fit outcomes, such as the bias described by D'Agostini in [94, 95].

Our procedure for handling the systematic errors is reviewed in Secs. 3C and 6D of [6]; see also a related discussion
in the appendices of [?] and [?]. The correlated errors for a given experiment, and effective shifts in the theory
or data that they cause, are estimated in a linearized approximation by including a contribution in the
figure-of-merit function $\chi^2$ proportional to the correlation matrix. A practical implementation of this approach
runs into the dilemma of distinguishing between additive and multiplicative correlated errors, which are often not
separated in the experimental publications, but must follow different prescriptions to prevent the bias. It is the
matrix of relative correlated errors that is typically published; the absolute correlated errors must be
reconstructed from it by following the prescription for either the additive or multiplicative type.
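The difference between the two prescriptions can be made concrete with a minimal sketch. It assumes that the published relative errors are stored as an array `sigma[i, alpha]` (data point `i`, systematic source `alpha`); the function name and the array layout are hypothetical and chosen only for illustration.

```python
import numpy as np


def absolute_correlated_errors(sigma, data, theory, multiplicative=True):
    """Rebuild absolute correlated errors from relative ones.

    multiplicative=True : each relative error is a fraction of the theory value T_i.
    multiplicative=False: each relative error is a fraction of the central data value D_i
                          (additive treatment).
    """
    sigma = np.asarray(sigma, dtype=float)              # shape (n_points, n_sources)
    norm = np.asarray(theory if multiplicative else data, dtype=float)
    return sigma * norm[:, None]


# Two data points that fluctuate around the same theory value, one 10% source.
sigma = [[0.10], [0.10]]
data = [90.0, 110.0]
theory = [100.0, 100.0]
print(absolute_correlated_errors(sigma, data, theory, multiplicative=True).ravel())
print(absolute_correlated_errors(sigma, data, theory, multiplicative=False).ravel())
```

In the additive case the point that fluctuated low is assigned a smaller absolute error than the point that fluctuated high, so it receives more weight in the fit. This asymmetry is the origin of the downward bias that the multiplicative treatment avoids.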

In inclusive jet production, the choice between the additive and multiplicative treatments modifies the large-x
behavior of the gluon PDF. This has been studied in the CT10 NNLO analysis, cf. Sec. 6D of [6]. In general, the
dominant sources of systematic error, especially at the Tevatron and LHC, should be treated as multiplicative rather
than additive; that is, by assuming that the relative systematic error corresponds to a fixed fraction of the theoretical
value, and not of the central data value. The final CT14 PDFs were derived under this assumption, by treating the
systematic errors as multiplicative in all experiments.* Alternative candidate fits of the CT14 family were also
performed by treating some correlated errors as additive. They produced
the PDFs that generally lie within the quoted uncertainty ranges, as in the previous exercise documented in 


III. OVERVIEW OF CT14 PDFS AS FUNCTIONS OF x AND Q 

Figure 5 shows an overview of the CT14 parton distribution functions, for Q = 2 and 100 GeV. The function
$x f(x, Q)$ is plotted versus x, for flavors $u$, $\bar u$, $d$, $\bar d$, $s = \bar s$, and $g$. We assume
$s(x, Q_0) = \bar s(x, Q_0)$, since their difference is consistent with zero and has large uncertainty [96]. The plots
show the central fit to the global data listed in Tables I and II, corresponding to the lowest total $\chi^2$ for our
choice of PDF parametrizations.

The relative changes between the CT10 NNLO and CT14 NNLO ensembles are best visualized by comparing their
PDF uncertainties. Fig. 6 compares the PDF error bands at 90% confidence level for the key flavors, with each band
normalized to the respective best-fit CT14 NNLO PDF. The blue solid and red dashed error bands are obtained for
CT14 and CT10 NNLO PDFs at Q = 100 GeV, respectively.

* According to the terminology adopted in Refs. [?], CT14 implements the correlated errors according to the "extended T"
prescription for all experiments, i.e., by normalizing the relative correlated errors by the current theoretical
value in each iteration of the fit.

FIG. 5: The CT14 parton distribution functions at Q = 2 GeV and Q = 100 GeV for $u$, $\bar u$, $d$, $\bar d$,
$s = \bar s$, and $g$.

Focusing first on the u and d flavors in the upper four subfigures, we observe that the $u$ and $\bar u$ PDFs have
mildly increased in CT14 at $x < 10^{-?}$, while the $d$ and $\bar d$ PDFs have become slightly smaller. These changes
can be attributed to a more flexible parametrization form adopted in CT14, which modifies the SU(2) flavor composition
of the first-generation PDFs at the smallest x values in the fit.

The CT14 d-quark PDF has increased by 5% at $x \approx 0.05$, after the ATLAS and CMS W/Z production data sets at
7 TeV were included. At x > 0.1, the update of the D0 charge asymmetry data set in the electron channel, reviewed
in Sec. II B 2, has reduced the magnitude of the d quark PDF by a large amount, and has moderately increased the
$u(x, Q)$ distribution.

The $\bar u(x, Q)$ and $\bar d(x, Q)$ distributions are both slightly larger at x = 0.01-0.1 because of several
factors. At x = 0.2-0.5, where there are only very weak constraints on the sea-quark PDFs, the new parametrization
form of CT14 results in smaller values of $\bar u(x, Q)$ and larger values of $\bar d(x, Q)$, as compared to CT10,
although for the most part within the combined PDF uncertainties of the two ensembles.

The central strangeness PDF $s(x, Q)$ in the third row of Fig. 6 has decreased for 0.01 < x < 0.15, but within
the limits of the CT10 uncertainty, as a consequence of the more flexible parametrization, the corrected calculation
for massive quarks in charged-current DIS, and the inclusion of the LHC data. The extrapolation of $s(x, Q)$ below
x = 0.01, where no data directly constrain it, also lies somewhat lower than before; its uncertainty remains large and
compatible with that in CT10. At large x, above about 0.2, the strange quark PDF is essentially unconstrained in
CT14, just as in CT10.

The central gluon PDF (last frame of Fig. 6) has increased in CT14 by 1-2% at $x \approx 0.05$ and has been somewhat
modified at x > 0.1 by the inclusion of the LHC jet production, by the multiplicative treatment of correlated errors,
and by the other factors discussed above. For x between 0.1 and 0.5, the gluon PDF has increased in CT14 as compared
to CT10.

FIG. 6: A comparison of 90% C.L. PDF uncertainties from CT14 NNLO (solid blue) and CT10 NNLO (red dashed) error sets.
Both error bands are normalized to the respective central CT14 NNLO PDFs.

Let us now review the ratios of various PDFs, starting with the ratio d/u shown in Fig. 7. The changes in d/u
in CT14 NNLO, as compared to CT10 NNLO, can be summarized as a reduction of the central ratio at x > 0.1,
caused by the 9.7 fb$^{-1}$ D0 charge asymmetry data, and an increased uncertainty at x < 0.05 allowed by the new
parametrization form. At x > 0.2, the central CT14 NNLO ratio is lower than that of CT10 NNLO, while their
relative PDF uncertainties remain about the same. This can be better seen from a direct comparison of the relative
PDF uncertainties (normalized to their respective central PDFs) in the third inset. The collider charge asymmetry
data constrain d/u at x up to about 0.4. At even higher x, outside of the experimental reach, the behavior of the
CT14 PDFs reflects the parametrization form, which now allows d/u to approach any constant value at $x \to 1$.

FIG. 7: A comparison of 90% C.L. uncertainties on the ratio $d(x,Q)/u(x,Q)$ for CT14 NNLO (solid blue), CT10 NNLO
(dashed red), and CJ12 NLO (green lines) error ensembles.

FIG. 8: A comparison of 90% C.L. uncertainties on the ratios $\bar d(x,Q)/\bar u(x,Q)$ and
$(s(x,Q) + \bar s(x,Q)) / (\bar u(x,Q) + \bar d(x,Q))$, for CT14 NNLO (solid blue) and CT10 NNLO (red dashed) error
ensembles.

At such high x, the CTEQ-JLab analysis (CJ12) has independently determined the ratio d/u at NLO, by
including the fixed-target DIS data at lower W and higher x that are excluded by a selection cut W > 3.5 GeV in
CT14, and by considering higher-twist and nuclear effects that can be neglected in the kinematic range of CT14 data.
The CT14 uncertainty band on d/u at NNLO lies for the most part between the CJmin and CJmax predictions at
NLO that demarcate the CJ12 uncertainty, cf. the first inset of Fig. 7. We see that the CT14 predictions on d/u
at x > 0.1, which were derived from high-energy measurements that are not affected by nuclear effects, fall within
the CJ12 uncertainty range obtained from low-energy DIS with an estimate of various effects beyond leading-twist
perturbative QCD. The ratio should be stable to inclusion of NNLO effects; thus, the two ensembles predict a similar
trend for collider observables sensitive to d/u.

Turning now to the ratios of sea quark PDFs in Fig. 8, we observe that the uncertainty on $\bar d(x,Q)/\bar u(x,Q)$
in the left inset has also increased at small x in CT14 NNLO. At x > 0.1, we assume that both $\bar u(x,Q_0)$ and
$\bar d(x,Q_0)$ are proportional to $(1-x)^{a_2}$ with the same power $a_2$; the ratio $\bar d(x,Q_0)/\bar u(x,Q_0)$
can thus approach a constant value that comes out to be close to 1 in the central fit, while the parametrization in
CT10 forced it to vanish. The uncertainty on $\bar d/\bar u$ has also increased across most of the x range.

The overall reduction in the strangeness PDF at x > 0.01 leads to a smaller ratio of the strange-to-nonstrange sea
quark PDFs, $(s(x,Q) + \bar s(x,Q)) / (\bar u(x,Q) + \bar d(x,Q))$, presented in the right inset of Fig. 8. At
x < 0.01, this ratio is determined entirely by the parametrization form and was found in CT10 to be consistent with the
exact SU(3) symmetry of PDF flavors, $(s(x,Q) + \bar s(x,Q)) / (\bar u(x,Q) + \bar d(x,Q)) \to 1$ at $x \to 0$, albeit
with a large uncertainty. The SU(3)-symmetric asymptotic solution at $x \to 0$ is still allowed in CT14 as a
possibility, even though the asymptotic limit of the central CT14 NNLO has been reduced and is now at about 0.6 at
$x = 10^{-?}$. The uncertainty of strangeness has increased at such small x and now allows
$(s(x,Q) + \bar s(x,Q)) / (\bar u(x,Q) + \bar d(x,Q))$ between 0.35 and 2.5 at $x = 10^{-?}$.


IV. COMPARISONS WITH HADRONIC EXPERIMENTS 

A. Electroweak total cross sections at the LHC 

Measurements of total cross sections for production of massive electroweak particles at hadron colliders provide 
cornerstone benchmark tests of the Standard Model. These relatively simple observables can be both measured with 
high precision and predicted in NNLO QCD theory with small uncertainties. In this subsection, we collect NNLO 
theory predictions based on CT14 and CT10 NNLO PDFs for inclusive W and Z boson production, top-quark pair
production, and Higgs-boson production (through gluon-gluon fusion) at the LHC with center-of-mass energies of 8 and
13 TeV. These theoretical predictions can be compared to the corresponding experimental measurements. We also
examine correlations between PDF uncertainties of the total cross sections in the context of the Hessian formalism,
following the approach summarized in Ref. [?]. PDF-driven correlations reveal relations between PDF uncertainties
of QCD observables through their shared PDF parameters.
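As a rough illustration of how such correlations are obtained in the Hessian formalism, the sketch below uses the standard symmetric master formulas for the PDF uncertainty of an observable and for the correlation cosine between two observables. It assumes that each observable has been evaluated on every pair of "plus" and "minus" eigenvector sets; the function names and the toy numbers are hypothetical.

```python
import numpy as np


def hessian_uncertainty(x_plus, x_minus):
    """Symmetric Hessian uncertainty: 0.5 * sqrt(sum_i (X_i^+ - X_i^-)^2)."""
    diff = np.asarray(x_plus, dtype=float) - np.asarray(x_minus, dtype=float)
    return 0.5 * np.sqrt(np.sum(diff ** 2))


def hessian_correlation(x_plus, x_minus, y_plus, y_minus):
    """Correlation cosine between X and Y:

    cos(phi) = sum_i (X_i^+ - X_i^-)(Y_i^+ - Y_i^-) / (4 * dX * dY).
    """
    dx = np.asarray(x_plus, dtype=float) - np.asarray(x_minus, dtype=float)
    dy = np.asarray(y_plus, dtype=float) - np.asarray(y_minus, dtype=float)
    norm = 4.0 * hessian_uncertainty(x_plus, x_minus) * hessian_uncertainty(y_plus, y_minus)
    return float(np.sum(dx * dy) / norm)


# Toy example with three eigenvector directions (numbers are made up).
x_plus, x_minus = [10.2, 10.1, 9.9], [9.8, 9.9, 10.1]
y_plus, y_minus = [5.1, 5.0, 5.0], [4.9, 5.0, 5.0]
print(hessian_uncertainty(x_plus, x_minus), hessian_correlation(x_plus, x_minus, y_plus, y_minus))
```

A value of cos(phi) near +1 or -1 corresponds to a narrow, tilted error ellipse for the pair of cross sections, while a value near 0 corresponds to an ellipse aligned with the axes.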

The masses of the top quark and Higgs boson are set to $m_t = 173.3$ GeV and $m_H = 125$ GeV, respectively, in
this work. The W and Z inclusive cross sections (multiplied by branching ratios for the decay into one charged lepton
flavor) are calculated by using the Vrap v0.9 program [25, 76] at NNLO in QCD, with the renormalization and
factorization scales ($\mu_R$ and $\mu_F$) set equal to the invariant mass of the vector boson. The total inclusive
top-quark pair cross sections are calculated with the help of the program Top++ v2.0 [99], with the QCD scales set to
the mass of the top quark. The Higgs boson cross sections via gluon-gluon fusion are calculated at NNLO in QCD by
using the iHixs v1.3 program [101], in the heavy-quark effective theory (HQET) with finite top quark mass correction,
and with the QCD scales set equal to the invariant mass of the Higgs boson.

FIG. 9: The CT14 and CT10 NNLO 90% C.L. error ellipses for the $W^+$ and $W^-$ cross sections, at the LHC 8 and 13 TeV.

FIG. 10: CT14 and CT10 NNLO 90% C.L. error ellipses for Z and $W^\pm$ cross sections, at the LHC 8 and 13 TeV.

FIG. 11: CT14 and CT10 NNLO 90% C.L. error ellipses for $t\bar t$ and Z cross sections, at the LHC 8 and 13 TeV.

FIG. 12: CT14 and CT10 NNLO 90% C.L. error ellipses for $t\bar t$ and ggH cross sections, at the LHC 8 and 13 TeV.

Figs. 9-12 show central predictions and 90% C.L. regions for $(W^+, W^-)$, $(Z, W^\pm)$, $(t\bar t, Z)$ and
$(t\bar t, ggH)$ pairs of inclusive cross sections at the LHC 8 and 13 TeV. In each figure, two elliptical confidence
regions are shown, obtained with either CT14 or CT10 NNLO PDFs. These can be used to read off PDF uncertainties and
correlations for each pair of cross sections. For example, Figs. 9 and 10 indicate that the PDF-induced uncertainties,
at the 90% C.L., are about 3.9%, 3.7%, and 3.7% for $W^+$, $W^-$, and Z boson production at the LHC 13 TeV,
respectively, with CT14 NNLO PDFs. As compared to the results using CT10 NNLO PDFs, the ratio of the total inclusive
cross sections of $W^+$ to $W^-$ production at the LHC 13 TeV is smaller by about one percent when using CT14 NNLO
PDFs, which also give a slightly larger uncertainty (by about half a percent) in that ratio. Specifically, the CT14
NNLO predictions
of that ratio at the 68% C.L. are at LHC 8 TeV, and 1.35^i[^“ at LHC 13 TeV, respectively. The central 

predictions at 8 TeV are in agreement with the recent CMS measurements [102]. They also show that the electroweak
gauge boson cross sections are highly correlated with each other; in fact, much of the uncertainty is driven in this
case by the small-x gluon [?].

In Fig. 11, we observe a moderate anti-correlation between the top-quark pair and the Z boson production cross
sections. This is a consequence of the proton momentum sum rule mediated by the gluon PDF [?]. In Fig. 12, the
Higgs boson cross section through gluon-gluon fusion does not have a pronounced correlation or anti-correlation with
the top-quark cross section, because they are dominated by the gluon PDF in different x regions. The Higgs boson
and $t\bar t$ cross section predictions are further examined in Section [?]. As a result of the changes in PDFs from
CT10 to CT14, both the calculated Higgs boson and top-quark pair production cross sections have increased slightly,
while the electroweak gauge boson cross sections have decreased. However, the changes of the central predictions are
within the error ellipses of either CT14 or CT10.


B. LHC and Tevatron inclusive jet cross sections 

We now turn to the comparisons of CT14 PDFs with new LHC cross sections on inclusive jet production. We
argued in Section II that the PDF uncertainty of inclusive jet production at the LHC is strongly correlated with the
gluon PDF in a wider range of x than in the counterpart measurements at the Tevatron. The true potential of LHC jets
for constraining the gluon PDF also depends on experimental uncertainties, which we can now explore for the first time
using the CMS and ATLAS data on inclusive jet cross sections at 7 TeV.

We first note that, in the context of our analysis, the single-inclusive jet measurements at the LHC are found to
be reasonably consistent with the other global data, including Tevatron Run-2 single-inclusive jet cross sections
measured by the CDF and D0 collaborations. The values of $\chi^2$ for the four jet experiments (ID=504, 514, 535, and
538) are listed at the end of Table II. We obtain very good fits ($\chi^2/N_{pt}$ = 1.09 and 0.55) to the D0 and ATLAS
jet data sets, and moderately worse fits ($\chi^2/N_{pt}$ = 1.45 and 1.33) to the CDF and CMS data sets. The
description of the Tevatron jet data sets has been examined as a part of the CT10 NNLO study [6], where it was pointed
out that the $\chi^2$ for the CDF Run-2 measurement tends to be increased by random, rather than systematic,
fluctuations of the data. With regard to describing the Tevatron jet data, the CT14 NNLO PDFs follow similar trends as
CT10 NNLO.


1. CMS single-inclusive jet cross sections 


Figure 13 shows a comparison between the measurements for the CMS inclusive jet data at 7 TeV and the NLO
theory prediction [88, 103, 104] utilizing CT14 NNLO PDFs. We discussed earlier in the paper that the missing
NNLO contributions to the hard-scattering cross section can be anticipated to be small under our QCD scale choices,
compared to the experimental uncertainty.

The CMS data, with 5 fb$^{-1}$ of integrated luminosity, employ the anti-$k_T$ jet algorithm [105] with jet radius
R = 0.7. The measurements are divided into 5 bins of rapidity and presented as a function of the $p_T$ of the jet, with
a total of 133 data points. The theoretical prediction based on the CT14 NNLO PDFs reproduces the behavior of
experimental cross sections across thirteen orders of magnitude.

Fig. 14 provides a more detailed look at these distributions, by plotting the shifted central data values divided by
the theory. The data are shifted by optimal amounts based on the treatment of the systematic errors as nuisance
parameters, cf. Ref. [?]. The error bars for the shifted data include only uncorrelated errors, i.e. statistical and
uncorrelated systematic errors added in quadrature. Here we notice moderate differences (up to a few tens of percent
of the central prediction) between theory and shifted data, which elevate $\chi^2$ for this data set by about 2.5
standard deviations for the central CT14 PDF set, or less for the error PDF sets.
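The size of this deviation can be cross-checked with a back-of-the-envelope estimate. The sketch below uses Fisher's classical approximation, in which $\sqrt{2\chi^2} - \sqrt{2N_{pt} - 1}$ is approximately a standard normal variable for large $N_{pt}$. This is an assumption made for illustration: it is not the $S_n$ definition used in the CT14 fit, which is documented in the references cited in the text, and the function name is hypothetical.

```python
import math


def fisher_equivalent_gaussian(chi2: float, n_points: int) -> float:
    """Fisher's approximation: sqrt(2*chi2) - sqrt(2*N - 1) is roughly N(0, 1) for large N."""
    return math.sqrt(2.0 * chi2) - math.sqrt(2.0 * n_points - 1.0)


# CMS 7 TeV inclusive jets as quoted in the text: chi2/N_pt = 1.33 with N_pt = 133.
n_pt = 133
chi2 = 1.33 * n_pt
print(f"equivalent Gaussian deviation ~ {fisher_equivalent_gaussian(chi2, n_pt):.2f}")
```

For these inputs the estimate comes out close to 2.5, in line with the number of standard deviations quoted above.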

FIG. 13: Comparison of data and theory for the CMS 7 TeV inclusive jet production, for CT14 NNLO PDFs. Measurements of
$d^2\sigma/dp_T\,dy$ for 5 rapidity bins are plotted as functions of jet $p_T$. The points are data with total
experimental errors, obtained by adding the statistical and systematic errors in quadrature. The bands are theoretical
calculations with 68% C.L. PDF uncertainties.

Although they are not statistically significant, the origin of these mild discrepancies can be further explored by
studying the correlated shifts allowed by the systematic uncertainties. In our implementation of systematic errors
[?], each correlated uncertainty $\alpha$ is associated with a normally distributed random nuisance parameter
$\lambda_\alpha$. When $\lambda_\alpha \neq 0$, it may effectively shift the central value of the data point $i$ in
the fit by

$$\lambda_\alpha\, \sigma_{i,\alpha}\, X_i,$$

where $\sigma_{i,\alpha}$ is the published fractional 1$\sigma$ uncertainty of data point $i$ due to systematic error
$\alpha$. $X_i$ is the cross section value that normalizes the fractional systematic uncertainty [?], set equal to the
theoretical value $T_i$ in the procedure of the current analysis.

Each $\lambda_\alpha$ is adjusted to optimize the agreement between theory and data. Fig. 15 shows a histogram of the
best-fit $\lambda_\alpha$ for the 19 sources of the systematic errors published by CMS [?]. In an ideal situation, the
optimized $\{\lambda_\alpha : \alpha = 1 \ldots 19\}$ would be normally distributed with a mean value of 0 and standard
deviation of 1. The actual distribution of the $\lambda_\alpha$ values in Fig. 15 appears to be somewhat narrower than
the standard normal one. This and the relatively high $\chi^2/N_{pt} = 1.33$ may indicate that either uncorrelated
systematic uncertainties are underestimated, or higher-order theoretical calculations are needed to describe the data.
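Because the nuisance parameters enter the figure-of-merit function quadratically in the linearized approximation, their best-fit values follow from a small linear system. The sketch below assumes the commonly used form $\chi^2 = \sum_i (D_i - T_i - \sum_\alpha \beta_{i\alpha}\lambda_\alpha)^2 / s_i^2 + \sum_\alpha \lambda_\alpha^2$ with $\beta_{i\alpha} = \sigma_{i,\alpha} T_i$ and uncorrelated errors $s_i$. The sign convention for the shift, the function name, and the array layout are assumptions made for illustration, not a transcription of the CT14 code.

```python
import numpy as np


def profile_nuisance_parameters(data, theory, sigma_uncorr, beta):
    """Minimize chi2 over the nuisance parameters analytically.

    Solves (1 + beta^T W beta) lambda = beta^T W (D - T) with W = diag(1 / s_i^2),
    then returns the best-fit lambda, the shifted data, and the resulting chi2.
    """
    data = np.asarray(data, dtype=float)
    theory = np.asarray(theory, dtype=float)
    weights = 1.0 / np.asarray(sigma_uncorr, dtype=float) ** 2
    beta = np.asarray(beta, dtype=float)               # shape (n_points, n_sources)

    n_sources = beta.shape[1]
    a_matrix = np.eye(n_sources) + beta.T @ (beta * weights[:, None])
    b_vector = beta.T @ ((data - theory) * weights)
    lam = np.linalg.solve(a_matrix, b_vector)

    shifted = data - beta @ lam                        # data after the optimal correlated shifts
    chi2 = float(np.sum(weights * (shifted - theory) ** 2) + np.sum(lam ** 2))
    return lam, shifted, chi2


# One data point, one 10% multiplicative source (hypothetical numbers).
lam, shifted, chi2 = profile_nuisance_parameters(
    data=[110.0], theory=[100.0], sigma_uncorr=[10.0], beta=[[0.10 * 100.0]]
)
print(lam, shifted, chi2)
```

A histogram of the returned `lam` values is the kind of distribution shown in Figs. 15 and 18. Sources to which the data have little sensitivity stay close to zero, which narrows the histogram relative to the standard normal curve.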


2. ATLAS single-inclusive jet cross sections 


Equivalent comparisons for the ATLAS 7 TeV inclusive jet production with 37 pb$^{-1}$ of integrated luminosity
are presented in Figs. 16-18. In this case, we compare to data in 7 bins of rapidity for the anti-$k_T$ jet algorithm
[105] with jet radius R = 0.6. The agreement is excellent in all figures, not the least because both statistical and
systematic errors are still large in this early data set. Among 119 sources of experimental errors that were
identified, many have little impact on the best fit. The resulting distribution of the nuisance parameters in Fig. 18
at the best fit is much narrower than the ideal Gaussian distribution, indicating that most of the correlated sources
need not deviate from their nominal values when the PDFs are fitted.

FIG. 14: Same as Fig. 13, shown as the ratio of shifted data for CMS 7 TeV divided by theory. The error bars correspond
to total uncorrelated errors. The shaded region shows the 68% C.L. PDF uncertainties.

To summarize, Figs. 13-18 demonstrate that CT14 PDFs agree with both sets of CMS and ATLAS single-inclusive
jet cross sections. The ATLAS collaboration also measured inclusive jet production at center-of-mass energy
$\sqrt{s}$ = 2.76 TeV and published ratios between the 2.76 and 7 TeV measurements in Ref. [57]. These two measurements
are well described by the theory prediction using CT14, with $\chi^2/N_{pt} \approx 1$. Furthermore, the ATLAS
collaboration published the inclusive jet measurements using another choice of jet radius of 0.4 [?]. Both ATLAS and
CMS collaborations measured cross sections for dijet production [52] based on the same data sample as the
single-inclusive jet measurements. These measurements are not included in the CT14 global analysis because of the
correlations between the two (dijet and single-inclusive jet) data sets. However, it has been verified that the CT14
analysis gives a good description for all these data sets as well.

FIG. 15: Histogram of optimized nuisance parameters $\lambda_\alpha$ for the sources of correlated systematic errors of
the CMS 7 TeV inclusive jet production. The curve is the standard normal distribution expected in the ideal case.

FIG. 16: Comparison of data on $d^2\sigma/dp_T\,dy$ and NLO theory for the ATLAS 7 TeV inclusive jet production, using
CT14 NNLO PDFs.


C. Differential cross sections for lepton pair production at the LHC

1. Charged lepton pseudorapidity distributions in W/Z boson production

Differential cross sections for production of massive vector bosons set important constraints on the flavor composition
of the proton, notably on the u and d quarks, anti-quarks and their ratios. Figure 19 compares CT14 NNLO theoretical
predictions with pseudorapidity ($|\eta_\ell|$) distributions of charged leptons from inclusive $W^+$ and $W^-$
production and decay in the 2010 ATLAS 7 TeV data sample with 33-36 pb$^{-1}$ of integrated luminosity [?]. Theoretical
predictions are computed using the program ResBos. The black data points represent the unshifted central values of the
data. The error bars indicate the total (statistical+systematic) experimental error. The blue band is the CT14 PDF
uncertainty evaluated at the 68% C.L. These three measurements share correlated systematic errors. From the figures, we
see that the data are described well by theory over the entire rapidity range, even in the absence of correlated
systematic shifts. The PDF uncertainties are similar in their size to those of the experimental measurements and,
overall, the theory predictions are within one standard deviation of the data.

FIG. 17: Same as Fig. 16, shown as the ratio of shifted data for ATLAS 7 TeV divided by the NLO theory. The error bars
correspond to total uncorrelated errors. The shaded region shows the 68% C.L. PDF uncertainties.

FIG. 18: Histogram of optimized nuisance parameters $\lambda_\alpha$ for the sources of correlated systematic errors of
the ATLAS 7 TeV inclusive jet production.

FIG. 19: Comparison between the 2010 ATLAS measurements [?] of the charged-lepton pseudorapidity and Z boson
rapidity distributions at $\sqrt{s}$ = 7 TeV, and the ResBos theory using CT14 NNLO PDFs.


2. Influence of W boson charge asymmetry measurements at the LHC 


FIG. 20: W charge asymmetry as a function of lepton pseudorapidity measured by the ATLAS Collaboration, compared to
the 68% C.L. CT14 NNLO uncertainty band. The kinematic requirements are $p_{T\ell}$ > 20 GeV, $p_{T\nu}$ > 25 GeV and
$m_T$ > 40 GeV.

FIG. 21: Charge asymmetry of decay muons and electrons from W production measured by the CMS experiment. The data
values have $p_{T\ell}$ > 25 or 35 GeV for the muon data and $p_{T\ell}$ > 35 GeV for the electron data. The vertical
error bars on the data points indicate total (statistical and systematic) uncertainties. The curve shows the CT14
theoretical calculation; the shaded region is the PDF uncertainty at 68% C.L.

Another useful observable for determining the parton distribution functions is the charge asymmetry for $W^+$ and
$W^-$ bosons produced in $pp$ or $p\bar p$ collisions. This process has been measured both at the Tevatron and at the
LHC. As the asymmetry involves a ratio of the cross sections, many experimental systematic errors cancel, leading to
very precise results. Without these collider data, the main information about the difference between the light flavors,
$d$, $\bar d$ and $u$, $\bar u$, would come from the BCDMS and NMC experiments, which are measurements of muon
deep-inelastic scattering on proton and deuteron targets. Under the assumption of charge symmetry between the nucleons,
the difference of the proton and deuteron cross sections distinguishes between the u and d PDFs in a nucleon. However,
the deuteron measurements are subject to nuclear binding corrections, which have been estimated by introducing
nuclear models [97, ?, 106], but are not calculable from first principles. In contrast, the charge asymmetry
data from the Tevatron and LHC colliders directly provide information about the difference between d and u flavors
without the need for nuclear corrections. By including the ATLAS and CMS charge asymmetry data, we are able to
obtain, for the first time, direct experimental constraints on the differences of the quark and antiquark PDFs for u
and d flavors at $x \sim 0.02$ typical for the 7 TeV kinematics.
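The cancellation of common systematic errors in the asymmetry can be seen in a two-line example. It uses the usual definition $A_{ch} = (\sigma^+ - \sigma^-)/(\sigma^+ + \sigma^-)$, evaluated in a given bin of lepton pseudorapidity; the function name and the cross-section values are made up for illustration.

```python
def charge_asymmetry(sigma_plus: float, sigma_minus: float) -> float:
    """A_ch = (sigma+ - sigma-) / (sigma+ + sigma-) for one pseudorapidity bin."""
    return (sigma_plus - sigma_minus) / (sigma_plus + sigma_minus)


# Hypothetical cross sections in one bin (arbitrary units).
nominal = charge_asymmetry(6.0, 4.0)

# A fully correlated 3% normalization shift (e.g. luminosity) scales both cross sections...
shifted = charge_asymmetry(6.0 * 1.03, 4.0 * 1.03)

# ...and leaves the asymmetry unchanged up to rounding.
print(nominal, shifted)
```

Any uncertainty that rescales the $W^+$ and $W^-$ cross sections by the same factor drops out of the ratio, which is why the asymmetry is measured more precisely than either cross section alone.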

Figure 20 shows a comparison of data and theory for the lepton charge asymmetry of inclusive W production
from the ATLAS experiment at the LHC 7 TeV [?]. These asymmetry data are correlated with the W/Z rapidity
measurements discussed in the previous subsection; all four W/Z data sets are included in the CT14 global analysis
using a shared correlation matrix from the ATLAS publication [?]. The measurement was carried out with several
kinematic cuts. The lepton transverse momentum was required to be greater than 20 GeV, the missing transverse
energy to be greater than 25 GeV, and the lepton-neutrino transverse mass to be greater than 40 GeV. The shaded
region is the PDF uncertainty of CT14 NNLO at 68% C.L. Again the points with error bars represent the unshifted
data with the experimental errors added in quadrature. The data fluctuate around the CT14 predictions and are
described well by the CT14 error band.

FIG. 22: Charge asymmetry of decay muons from W production measured by the LHCb experiment.

Figure 21 presents a similar comparison of the unshifted data and CT14 NNLO theory for the charge asymmetry of
decay muons [?] and electrons [?] from inclusive W production from the CMS experiment at the LHC 7 TeV. The
asymmetry for muons is measured with 4.7 fb$^{-1}$ of integrated luminosity, with $p_{T\ell}$ > 25 and 35 GeV; the
asymmetry for electrons is measured with 840 pb$^{-1}$ and $p_{T\ell}$ > 35 GeV. Here we note that the CMS measurement
does not apply a missing $E_T$ cut to $A_{ch}$, contrary to the counterpart ATLAS $A_{ch}$ measurement. Theory
predictions are the same for both the muon and electron channels with the same cuts. The muon and electron data are
consistent with one another, but the muon data have smaller statistical and systematic uncertainties, as is apparent in
Fig. 21. All three subsets of CMS $A_{ch}$ agree with predictions using CT14; their $\chi^2$ is further improved by
optimizing the correlated shifts. The electron data and the muon data with the $p_{T\ell}$ cut of 35 GeV are included
in the CT14 global analysis. The muon data with a $p_{T\ell}$ cut of 25 GeV are not included in the CT14 analysis, but
nevertheless are well described.

In the LHCb measurement of the charged lepton asymmetry at 7 TeV [?], the muons are required to have a
transverse momentum greater than 20 GeV. The corresponding comparison of the CT14 NNLO predictions to the
LHCb $A_{ch}$ data is shown in Fig. 22. The LHCb case is especially interesting, as the LHCb acceptance for charged
leptons extends beyond the rapidity range measured by ATLAS and CMS. Thus, the LHCb results are sensitive to
the u and d quark PDFs at larger x values than at ATLAS or CMS. Good agreement between data and theory
is again observed.

FIG. 23: Invariant mass distributions of Drell-Yan pairs in the high-mass region by ATLAS 7 TeV [108], with superimposed
NNLO predictions based on CT14 NNLO PDFs. The left subfigure shows the differential cross sections as a function of the
dilepton mass $m_{\ell\ell}$. The right subfigure shows the ratio of ATLAS shifted data to CT14 theory predictions.

FIG. 24: Same as in Fig. 23, for ATLAS 7 TeV differential distributions of Drell-Yan pairs in the low-mass and extended
low-mass regions [109].
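The statement that forward measurements such as LHCb reach larger x can be checked with leading-order kinematics, in which a boson of mass $M$ produced at rapidity $y$ in collisions at energy $\sqrt{s}$ probes momentum fractions $x_{1,2} = (M/\sqrt{s})\, e^{\pm y}$. The sketch below is an illustration only: it uses the boson rapidity rather than the measured lepton pseudorapidity, and the W mass value and the sample rapidities are assumptions that do not come from the text.

```python
import math


def parton_momentum_fractions(mass_gev: float, sqrt_s_gev: float, rapidity: float):
    """Leading-order momentum fractions x1, x2 = (M / sqrt(s)) * exp(+-y)."""
    scale = mass_gev / sqrt_s_gev
    return scale * math.exp(rapidity), scale * math.exp(-rapidity)


M_W = 80.4        # GeV, approximate W boson mass (assumed input)
SQRT_S = 7000.0   # GeV, LHC at 7 TeV

for y in (0.0, 1.0, 2.0, 3.0):
    x1, x2 = parton_momentum_fractions(M_W, SQRT_S, y)
    print(f"y = {y:.1f}:  x1 = {x1:.4f},  x2 = {x2:.5f}")
```

At central rapidity both momentum fractions are near 0.01, while at a forward rapidity of about 3 one parton carries x above 0.2 and the other x below 0.001, so forward data constrain the PDFs in two regions that central measurements reach less directly.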


3. Production of Drell-Yan pairs at ATLAS 


In Figs. 23 and 24, we compare CT14 NNLO predictions to ATLAS 7 TeV measurements of differential cross sections
for production of high-mass [108] and low-mass [109] Drell-Yan pairs, plotted as a function of dilepton invariant mass
$m_{\ell\ell}$. The experimental cross sections correspond to the "electroweak Born level", unfolded from the raw data
by correcting for electroweak final-state radiation. The high-mass data sample corresponds to
116 < $m_{\ell\ell}$ < 1000 GeV. At low dilepton masses, we compare to the combined electron+muon sample at
26 < $m_{\ell\ell}$ < 66 GeV for L = 1.6 fb$^{-1}$ in the upper row, as well as to the muon sample at
12 < $m_{\ell\ell}$ < 66 GeV for L = 35 pb$^{-1}$ in the lower row. Fiducial acceptance cuts on the decay leptons are
specified inside the figures. Correlated experimental uncertainties are included in the comparison.

On the theory side, the cross sections are calculated at NNLO in QCD with the ApplGrid interface to FEWZ
[110-113], including photon-scattering contributions. Experimental uncertainties in these cross sections tend to
be larger than the PDF uncertainties, as illustrated by the figures; hence we only compare these data to the CT14
predictions a posteriori, without actually including them in the CT14 fit.

It can be observed in the figures that CT14 NNLO PDFs agree well with the high-mass and low-mass data samples
both in terms of the cross sections (in the left subfigures) and ratios of the shifted data to theoretical predictions
(right subfigures). The PDF uncertainty bands, indicated by light-blue color, approximate the average behavior of
the experimental data without systematic discrepancies.

D. $W^\pm$ charge asymmetry from the DØ experiment at the Tevatron


We reviewed above that, historically, measurements of $W^\pm$ charge asymmetry at the Tevatron have been important
in the CTEQ-TEA global analysis. For example, the CTEQ6 PDFs (circa 2002) and CT10 PDFs (circa 2010-2012)
included the asymmetry data from the CDF and DØ experiments to supplement the constraints on u and d quark
PDFs at $x > 0.1$ from fixed-target DIS experiments. The charge asymmetry at the Tevatron probes the differences
of the slope in x of the PDFs for u and d flavors.

A new charge asymmetry measurement from the DØ experiment at the Tevatron has recently been published,
using the full integrated luminosity (9.7 fb$^{-1}$) from Run-2 [14]. The experimental uncertainties, both statistical and
systematic, are smaller than in the previous $A_{ch}$ measurement. Figure 25 compares the DØ Run-2 data and
various theoretical predictions at NNLO for both the latest (left) and the previous DØ data set (right). We show the
unshifted data with the total experimental errors as error bars, and the 68% C.L. PDF uncertainties as the shaded
regions. As an alternative representation, Figure 26 shows the differences between theory and shifted data, where the
error bars represent the uncorrelated experimental errors. From the two figures, we conclude that it is difficult to fit
both data sets well, given the smallness of the systematic shifts associated with $A_{ch}$. While the 9.7 fb$^{-1}$ electron data
set is in better agreement with the global data, including the DØ muon [42] and CDF $A_{ch}$ measurements, the
best-fit $\chi^2/N_{pt}$ for the 9.7 fb$^{-1}$ sample remains relatively high (about 2) and is sensitive to the detailed implementation
of NNLO corrections. In-depth studies on the DØ asymmetry data will be presented in a forthcoming paper. When
the high-luminosity DØ $A_{ch}$ measurement was substituted for the low-luminosity one, we observed a reduction in the
d/u ratio at $x > 0.1$ compared to the CT10W NLO and CT10 NNLO sets.

In total, constraints from the LHC and Tevatron W/Z differential cross sections and asymmetries lead to important
changes in the quark sector PDFs, as documented in Sec. III. At $x < 0.02$, we obtain more realistic error bands for
the $u$, $\bar{u}$, $d$, $\bar{d}$ PDFs upon including the ATLAS and CMS data sets. At $x > 0.1$, the high-luminosity DØ charge
asymmetry and other compatible experiments predict a softer behavior of $d(x,Q)/u(x,Q)$ than in CT10W.

FIG. 25: Charge asymmetry of decay electrons from $W^\pm$ production measured by the DØ experiment in Run-2 at the
Tevatron with high (left) and low (right) luminosities, compared to several generations of CTEQ-TEA PDFs.




FIG. 26: Same as Fig. 25, plotted as the difference between theory and shifted data for $A_{ch}$ from DØ Run-2 (9.7 fb$^{-1}$).


E. Constraints on strangeness PDF from CCFR, NuTeV, and LHC experiments 


Let us now turn to the strangeness PDF $s(x,Q)$, which has become smaller at $x > 0.05$ in CT14 compared to
our previous analyses, CT10 and CTEQ6.6. Although the CT14 central $s(x,Q)$ lies within the error bands of either
earlier PDF set, it is important to verify that it is consistent with the four fixed-target measurements that are known
to be sensitive to $s(x,Q)$: namely, measurements of dimuon production in neutrino and antineutrino collisions with
iron targets, from the CCFR [31] and NuTeV collaborations (ID=124-127).


Predictions using previous CTEQ PDFs were in agreement with these four experiments. In Table I for CT14,
the four corresponding $\chi^2/N_{pt}$ values are also good. Supporting evidence comes from the point-by-point comparisons in
Figs. 27 and 28 between the theoretical cross sections for CT14 NNLO PDFs and the dimuon data from the NuTeV
experiment in neutrino and antineutrino scattering. The analogous comparisons for the CCFR experiment are in
Figs. 29 and 30. Given the size of the measurement errors and of the PDF uncertainty, it is clear that CT14 central
predictions provide a good description of the dimuon cross sections. Also, our estimate for the uncertainty of the
strange PDF looks reasonable: it is comparable to the measurement errors for these cross sections, which are known
to be sensitive mostly to the strange quark PDF.

FIG. 27: Comparison of data and theory for the NuTeV measurements of dimuon production in neutrino-iron collisions. The
data are expressed in the form of $d^2\sigma/dx\,dy$ and shown as a function of x for a certain y and neutrino energy.

FIG. 28: Same as Fig. 27 for the NuTeV measurements of dimuon production in antineutrino-iron collisions.

FIG. 29: Same as Fig. 27 for the CCFR measurements of dimuon production in neutrino-iron collisions.

FIG. 30: Same as Fig. 27 for the CCFR measurements of dimuon production in antineutrino-iron collisions.

Nevertheless, the CT14 central strangeness PDF lies on the lower side of the CT10 PDF uncertainty in some
kinematic ranges. As mentioned in the introduction, the reduction is in part attributable to elimination of a
computational error (wrong sign of a term) in the treatment of heavy-quark mass effects in charged-current DIS in
post-CTEQ6.5 analyses, and in part from other sources, especially the introduction of the LHC W/Z data and more
flexible parameterizations for all PDF flavors.

The ATLAS and CMS experimental collaborations have recently published studies on the strangeness content of
the proton and have come to somewhat discrepant conclusions. On the ATLAS side, two papers were published, one
in 2012 and one in 2014 [114]. In the 2012 study, the inclusive DIS and inclusive $W^\pm$ and Z boson production
measurements [48] were employed to determine the strangeness fraction of the proton for one value of $(x, Q)$. In the
2014 study, the ATLAS 7 TeV $W + c$-jet, $W + D^{(*)}$ [114], and inclusive W/Z cross sections were used. These two
analyses determined the ratio ($r^s$) of strange to down-sea quark PDF,

$$ r^s = 0.5\,\frac{s + \bar{s}}{\bar{d}} \quad \text{at } x = 0.023,\ Q = 1.4\ \mathrm{GeV}. \qquad (3) $$

They find

$$ r^s = 1.00^{+0.25}_{-0.28} \quad \text{ATLAS (2012)}, \qquad r^s = 0.96^{+0.26}_{-0.30} \quad \text{ATLAS (2014)}, \qquad (4) $$

which imply a rather large strangeness density.
In 2014, the CMS collaboration [46] determined the ratios

$$ R^s = \frac{s + \bar{s}}{\bar{u} + \bar{d}} \quad \text{at } x = 0.023,\ Q = 1.4\ \mathrm{GeV} \qquad (5) $$

and

$$ \kappa^s = \frac{\int_0^1 x\,[s(x,Q^2) + \bar{s}(x,Q^2)]\,dx}{\int_0^1 x\,[\bar{u}(x,Q^2) + \bar{d}(x,Q^2)]\,dx} \qquad (6) $$

by using inclusive DIS, the charge asymmetry of decay muons from $W^\pm$ production [46], and W + charm production
differential cross sections [115] at 7 TeV. They obtain

$$ R^s = 0.65^{+0.19}_{-0.17}, \qquad \kappa^s(Q^2 = 20\ \mathrm{GeV}^2) = 0.52^{+0.18}_{-0.15}, \qquad \text{CMS (2014)}. \qquad (7) $$

Notice that ATLAS and CMS use two different definitions, $r^s$ and $R^s$, for the strangeness fraction, which are supposed
to coincide at the initial scale $Q_0 = 1.4$ GeV, if $\bar{u}(x, Q_0) = \bar{d}(x, Q_0)$.†
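
The statement that the two definitions coincide for an SU(2)-symmetric sea can be checked numerically. The following is a minimal sketch, assuming the definitions $r^s = 0.5\,(s+\bar{s})/\bar{d}$ and $R^s = (s+\bar{s})/(\bar{u}+\bar{d})$; the PDF values are hypothetical placeholders chosen for illustration and are not CT14, ATLAS, or CMS output.

```python
def r_s(s, sbar, dbar):
    """ATLAS-style strangeness ratio: 0.5 * (s + sbar) / dbar."""
    return 0.5 * (s + sbar) / dbar


def R_s(s, sbar, ubar, dbar):
    """CMS-style strangeness ratio: (s + sbar) / (ubar + dbar)."""
    return (s + sbar) / (ubar + dbar)


# Hypothetical PDF values at a single (x, Q0) point (illustrative only).
s = sbar = 0.30
dbar = 0.55

# SU(2)-symmetric sea: ubar == dbar, so the two ratios agree.
symmetric = (r_s(s, sbar, dbar), R_s(s, sbar, dbar, dbar))

# Asymmetric sea: ubar != dbar, so the two ratios differ.
asymmetric = (r_s(s, sbar, dbar), R_s(s, sbar, 0.45, dbar))

print("ubar = dbar :", symmetric)
print("ubar != dbar:", asymmetric)
```

With $\bar{u} < \bar{d}$, as in the second case, $R^s$ exceeds $r^s$, so the two experimental numbers are directly comparable only under the symmetric-sea assumption.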

For comparison, at the factorization scale Q = 1.4 GeV and x = 0.0234, the CT14 and CT10 predictions are

$$ r^s_{\rm CT14NNLO} = \frac{s(x,Q)}{\bar{d}(x,Q)} = 0.53 \pm 0.20, \qquad r^s_{\rm CT10NNLO} = \frac{s(x,Q)}{\bar{d}(x,Q)} = 0.76 \pm 0.17. \qquad (8) $$

Both CT14 and CT10 indicate a smaller strangeness than the ATLAS result and are compatible with CMS; the $r^s$
ratio is smaller for CT14 than for CT10.

† Both ATLAS and CMS studies are performed in the HERAFitter framework [107] and assume SU(2)-symmetric sea quark PDF
parametrizations at the initial scale $Q_0 = 1.4$ GeV.

The NOMAD Collaboration has also completed a study of the strange quark PDF, relying on $\nu_\mu + \mathrm{Fe} \to \mu^+ + \mu^- + X$
measurements [117] at lower energies than NuTeV and CCFR. They find that the strangeness suppression factor is

$$ \kappa^s(20\ \mathrm{GeV}^2) = 0.591 \pm 0.019, \qquad (9) $$

also yielding a smaller strangeness density than the ATLAS result. In another recent study by S. Alekhin and
collaborators [118], the strange quark distribution and the ratios $r^s$ and $\kappa^s$ were determined in a QCD analysis
including the NuTeV, CCFR, NOMAD and CHORUS measurements. The study uses the fixed-flavor-number (FFN)
scheme for the heavy-flavor treatment. Their main result is $\kappa^s(20\ \mathrm{GeV}^2) = 0.654 \pm 0.030$. The CT14 and CT10
predictions for this quantity are

$$ \kappa^s_{\rm CT14NNLO} = 0.62 \pm 0.14, \qquad \kappa^s_{\rm CT10NNLO} = 0.73 \pm 0.11. \qquad (10) $$

The CT14 calculation is consistent with the NOMAD central value. However, the CT14 PDF uncertainty is
considerably larger than the uncertainty quoted in the NOMAD paper, partly because of a different convention for the PDF
uncertainty.


F. The CMS W + c production measurement 


Another experimental measurement that has direct access to the strange quark distribution is the associated
production of a W boson and a charm quark at the LHC. Such a measurement was reported by the CMS collaboration for
$\sqrt{s}$ = 7 TeV and a total integrated luminosity of 5 fb$^{-1}$ [115]. Cross sections and ratios of cross sections with the
observed $W^+$ and $W^-$ bosons were measured differentially with respect to the absolute value of the pseudorapidity of
the charged lepton from the W boson decay. As the theoretical cross section is not yet known at NNLO, the data were
not directly included in the global fit, but are compared here to NLO calculations based on MCFM 6.0 [119], assuming
a non-zero charm quark mass, and excluding contributions from gluon splitting into a $c\bar{c}$ pair. The renormalization
and factorization scales are set to the virtuality of the W boson. The transverse momentum of the charged lepton is
required to be at least 25 GeV. The theoretical calculation applies the same kinematical cuts as in the experimental
analysis, but at the parton level.

The left panel of Fig. 31 shows the pseudorapidity distribution of the decay charged lepton from W boson decay
in $W^\pm + c$ production at 7 TeV. The format of the figures is the same as in the previous comparisons. The total
experimental errors in the figures are reasonably close to the 68% C.L. PDF uncertainties. With further experimental
and theoretical improvements, the process may contribute to the reduction of the PDF uncertainty.

The right panel shows the ratio of charged lepton rapidity distributions in $W^+ + c$ and $W^- + c$ production, which
provides a handle on the strangeness asymmetry, $s - \bar{s}$. The CT14 parametrization allows for no intrinsic s-asymmetry
at the initial scale $Q_0$. (At higher scales, a tiny asymmetry is generated by 3-loop DGLAP evolution.) Our prediction
reproduces the average trend of the data; however, the experimental errors are larger than the PDF uncertainties.

FIG. 31: Comparison of the CT14 predictions to $W^\pm + c$ differential cross sections (left) and to the ratio of $W^+ + c$ to $W^- + c$
cross sections (right) from the CMS measurement at 7 TeV.

FIG. 32: Correlation cosines between the PDFs of select flavors, the $W^\pm + c$ cross section, and the $W^+/W^-$ cross section ratio,
as a function of x in the PDF.

The specific x ranges that are probed by CMS $W^\pm + c$ cross sections can be identified by plotting correlation
cosines between the PDFs of various flavors, the $W^\pm + c$ cross section, or cross section ratios. Fig. 32 shows such
correlation cosines for the s quark, gluon, and d quark PDFs at the factorization scale of 100 GeV. Lines in darker
colors correspond to bins with larger rapidities. In the case of the differential cross section, the PDF correlations are
most significant for the strange quark distributions at x = 0.01 - 0.1, as indicated by their strong correlations with
$\cos\phi \sim 1$. The gluon does not play a significant role, due to its relatively smaller uncertainty in the same x region. In
the case of the cross section ratio, the correlation with the strangeness is still dominant. In addition, at large rapidity,
the d quark contribution to the $W^-$ cross sections is mildly anti-correlated, indicating that the ratio has marginal
sensitivity to $d(x, Q)$ at x around 0.01.

FIG. 33: Correlation cosines between the ratio of total cross sections for $W^\pm$ production and Z boson production, and s quark
PDF from CT14 and CT10 NNLO sets, for LHC 7 and 13 TeV.

Another well-known probe of the strangeness content of the proton is provided by the ratio of total cross sections
for LHC $W^\pm$ and Z boson production. The correlation cosine between $\sigma(W^\pm)/\sigma(Z)$ and $s(x,Q)$ can be viewed
in Fig. 33. As expected, we observe strong anti-correlation in a certain x range at all LHC center-of-mass energies.
Compared to CT10, the x region of the strongest sensitivity shifts to higher x, and the x dependence gets flatter in
CT14.
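
The correlation cosines used in this subsection can be computed from predictions on Hessian eigenvector sets. The sketch below assumes the commonly used Hessian form, $\cos\phi = \frac{1}{4\,\Delta X\,\Delta Y}\sum_i (X_i^+ - X_i^-)(Y_i^+ - Y_i^-)$ with $\Delta X = \frac{1}{2}\sqrt{\sum_i (X_i^+ - X_i^-)^2}$; the function names and toy numbers are hypothetical and only illustrate the arithmetic.

```python
import math


def hessian_delta(plus, minus):
    """Symmetric Hessian uncertainty: 0.5 * sqrt(sum_i (X_i^+ - X_i^-)^2)."""
    return 0.5 * math.sqrt(sum((p - m) ** 2 for p, m in zip(plus, minus)))


def correlation_cosine(x_plus, x_minus, y_plus, y_minus):
    """Correlation cosine between two quantities X and Y, each evaluated
    on the same list of (+/-) eigenvector sets."""
    dx = hessian_delta(x_plus, x_minus)
    dy = hessian_delta(y_plus, y_minus)
    dot = sum((xp - xm) * (yp - ym)
              for xp, xm, yp, ym in zip(x_plus, x_minus, y_plus, y_minus))
    return dot / (4.0 * dx * dy)


# Toy predictions for X on three eigenvector pairs (hypothetical numbers).
x_plus = [1.02, 1.00, 1.01]
x_minus = [0.98, 1.00, 0.99]

# Y fully correlated with X (Y = 2X + 3) and fully anti-correlated (Y = -X).
y_corr_plus = [2 * v + 3 for v in x_plus]
y_corr_minus = [2 * v + 3 for v in x_minus]
y_anti_plus = [-v for v in x_plus]
y_anti_minus = [-v for v in x_minus]

cos_corr = correlation_cosine(x_plus, x_minus, y_corr_plus, y_corr_minus)
cos_anti = correlation_cosine(x_plus, x_minus, y_anti_plus, y_anti_minus)
print(cos_corr, cos_anti)
```

In practice X would be a PDF value at fixed x and Q, and Y a cross section or cross section ratio; values near +1 or -1 flag the x ranges to which the observable is most sensitive.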


V. IMPACT ON HIGGS BOSON AND $t\bar{t}$ CROSS SECTIONS AT THE LHC


Gluon fusion provides the largest cross section for production of a Higgs boson. It was the most important process
for the discovery of the Higgs boson in 2012, and it continues to be essential for detailed studies of Higgs boson
properties. A great deal of benchmarking of hard cross sections and PDFs for the gg initial state was carried out
both before and after the discovery [20, 92, 93, 121-123]. This was motivated in part by the fact that the PDF
uncertainty for the gg initial state was comparable to the renormalization and factorization scale uncertainties in the
theoretical cross section at NNLO for producing a Higgs boson through gluon fusion. The recent calculation of the
gluon fusion process at NNNLO has reduced the scale uncertainty in the hard cross section still further, making
the PDF uncertainty even more critical.

Similarly, production of a $t\bar{t}$ final state is crucial to many analyses at the LHC, as both a standard model signal and
as a background to new physics. By far the dominant subprocess for $t\bar{t}$ production at the LHC is $gg \to t\bar{t}$, making
$t\bar{t}$ production an important benchmark for understanding the gg PDF luminosity, especially with the current
calculation of the $t\bar{t}$ total inclusive cross section now available at NNLO [99].

Using CT10 PDFs, we have recently performed detailed analyses of the predictions for $gg \to H$ and $t\bar{t}$ cross sections,
as well as their uncertainties from both the PDFs and the strong coupling $\alpha_s$ [36, 124]. In this section we update
these studies and review CT14 predictions for $gg \to H$ and $t\bar{t}$ total and differential cross sections.


A. Higgs boson from gluon fusion at the LHC 


We begin with an analysis of the PDF and $\alpha_s$ uncertainties for $gg \to H$. For this, we have utilized the NNLO
code iHixs 1.3 [101], choosing the Higgs boson mass to be $M_H$ = 125 GeV, and with both the renormalization and
factorization scales fixed at $\mu = M_H$. Here, we have included the finite top quark mass correction (about 7%) to the
fixed-order NNLO result obtained using the HQET (with infinite top quark mass approximation).

To calculate the 90% C.L. PDF and $\alpha_s$ uncertainties of an arbitrary cross section X according to the most
conventional (Hessian) method, we provide error PDFs (56 in the case of CT14) to probe independent combinations
of the PDF parameters for the central $\alpha_s(M_Z) = 0.118$, plus two additional PDFs obtained from the best fits with
$\alpha_s(M_Z) = 0.116$ and 0.120. Using the error sets, the combined PDF+$\alpha_s$ uncertainty on X is estimated by adding
the PDF and $\alpha_s$ uncertainties in quadrature [125]. The quadrature-based combination is exact if $\chi^2$ has a quadratic
dependence, and X has a linear dependence, on the PDF fitting parameters in the vicinity of the best fit. To account
for some mild nonlinearities, asymmetric errors are allowed in the positive and negative directions of each eigenvector
in the fitting parameter space.


Another method for estimating the PDF and $\alpha_s$ uncertainties on X introduces Lagrange multipliers (LM) [17]. It
does not rely on any assumptions about the functional dependence of X on the PDF parameters. Instead, the PDFs
are refitted a number of times, while fixing X to take some user-selected value in each fit. Then the uncertainty in X
can be estimated by looking at how $\chi^2$ in the series of fits varies depending on the input value of X. The downside of
the LM method is that it requires repeating the PDF fit many times in order to calculate the uncertainty of each given
observable. It is clearly impractical for general-purpose experimental analyses; however, it can be straightforwardly
performed for a few selected observables. As a side benefit, the LM method also provides an easy way to see which
experimental data sets in the PDF global analysis have the most impact on the PDF dependence of X. Thus, in
this section we will perform both the LM and Hessian analyses of the uncertainties for the Higgs boson and $t\bar{t}$ cross
sections at the LHC.

We first do these calculations while keeping the strong coupling fixed at its central value of $\alpha_s(M_Z) = 0.118$
recommended by the PDF4LHC group. The uncertainties obtained this way are purely due to the PDFs. The results
of the LM analysis are illustrated by Fig. 34, where we plot the change $\Delta\chi^2$ in $\chi^2$ as a function of the tentative cross
section $\sigma_H$ for Higgs boson production via gluon fusion in pp collisions at energies $\sqrt{s}$ = 8 and 13 TeV. $\Delta\chi^2 = 0$
corresponds to the best-fit PDFs to the CT14 experimental data set, so that the minimum of the approximately
parabolic curves is at our best-fit prediction for $\sigma_H$. Non-zero $\Delta\chi^2$ are obtained with an extra constraint that
enforces $\sigma_H$ to take the values on the horizontal axis that deviate from the best-fit ones. We have plotted the changes
of both the simple $\chi^2$ (solid) and the $\chi^2$+Tier-2 penalty (dashed), in order to see the effects of requiring that no
particular data set is too badly fit in the global analysis. (The Tier-2 penalty makes use of the variable $S_n$, which
gives a measure of the goodness-of-fit for each individual data set. A large $S_n$ means that the experiment is not
consistent with the theory.) We see that the two curves are almost
identical over much of the range plotted, only beginning to diverge when $\sigma_H$ is far from the best-fit value, and one or
more experimental data sets can no longer be satisfactorily fit.

FIG. 34: Dependence of the increase in $\chi^2$ in the constrained CT14 fit on the expected cross section $\sigma_H$ at the LHC 8 and
13 TeV, for $\alpha_s(M_Z) = 0.118$. The solid and dashed curves are for the constrained fits without and with the Tier-2 penalties,
respectively. The red dots correspond to the upper and lower 90% C.L. limits calculated by the Hessian method.


We can estimate asymmetric errors $(\delta\sigma_H)_\pm$ at the 90% C.L. by allowing a tolerance $\Delta\chi^2 = T^2$, with T of about 10.
Given the nearly parabolic nature of these plots, we see that the 68% C.L. errors can be consistently defined using a
range corresponding to $\Delta\chi^2 = (T/1.645)^2$. The 90% C.L. and 68% C.L. tolerance values are indicated by the upper
and lower horizontal lines, respectively, in each of the plots. Finally, the red dots are the upper and lower 90% C.L.
limits from the Hessian method analysis. They agree quite well with the LM analysis using the $\chi^2$+Tier-2 penalty at
both 8 and 13 TeV. The effect of the Tier-2 penalty is modest, and the deviations from the parabolic behavior are small.
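
The logic of the LM scan and of the tolerance criterion can be illustrated on a toy problem. The sketch below is not the CT14 fitting code: it assumes a purely quadratic $\chi^2$ in two hypothetical fit parameters and an observable X that is linear in them, so each constrained fit can be solved in closed form. The matrix `H`, the gradient `g`, and the central value `X0` are invented for illustration.

```python
import numpy as np

# Toy model (all numbers hypothetical): near the best fit,
#   chi^2(d) = chi^2_0 + d^T H d   and   X(d) = X0 + g . d,
# where d is the displacement of the fit parameters from the best fit.
H = np.array([[40.0, 5.0], [5.0, 10.0]])
g = np.array([0.8, -0.3])
X0 = 42.7
T = 10.0  # 90% C.L. tolerance: Delta chi^2 = T^2


def constrained_fit(lam):
    """Minimize chi^2(d) + lam * X(d); return (X, Delta chi^2) at the minimum."""
    d = -0.5 * lam * np.linalg.solve(H, g)
    return X0 + g @ d, d @ H @ d


# Scan the Lagrange multiplier; each value corresponds to one "refit".
lams = np.linspace(-200.0, 200.0, 241)
scan = np.array([constrained_fit(lam) for lam in lams])
X_vals, dchi2 = scan[:, 0], scan[:, 1]


def upper_limit(level):
    """Interpolate the X >= X0 branch of the scan to Delta chi^2 = level."""
    mask = X_vals >= X0
    order = np.argsort(dchi2[mask])
    return np.interp(level, dchi2[mask][order], X_vals[mask][order])


err90 = upper_limit(T ** 2) - X0
err68 = upper_limit((T / 1.645) ** 2) - X0

# In this quadratic/linear limit the Hessian result is T * sqrt(g^T H^-1 g).
hessian90 = T * np.sqrt(g @ np.linalg.solve(H, g))
print(err90, err68, hessian90)
```

In this idealized limit the curve of $\Delta\chi^2$ versus X is an exact parabola, the LM and Hessian 90% C.L. errors coincide, and the 68% C.L. error is smaller by the factor 1.645. In a real global fit, departures from these equalities measure the nonlinearities and the effect of penalties such as Tier-2.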

We next perform a LM scan by allowing both the $\sigma_H$ cross section and $\alpha_s(M_Z)$ to vary as “fitting parameters”,
and by including the world-average constraints on $\alpha_s(M_Z)$ directly into the $\chi^2$ function. (Details can be obtained in
Ref. [36].) We examine $\chi^2$ as a function of $(\alpha_s(M_Z), \sigma_H)$ and trace out contours of constant $\chi^2$+Tier-2 penalty in
the $(\alpha_s, \sigma_H)$ plane in Fig. 35 for $\sqrt{s}$ = 8 and 13 TeV.

A contour here is the locus of points in the $(\alpha_s, \sigma_H)$ plane along which the constrained value of $\chi^2$+Tier-2 is
constant. We see from Fig. 35 that the values of $\sigma_H$ and $\alpha_s(M_Z)$ are strongly correlated, as expected, since the gg
fusion cross section is proportional to $\alpha_s^2(M_Z)$. Larger values of $\alpha_s(M_Z)$ correspond to larger values of $\sigma_H$ for the
same goodness-of-fit to the global data, even though there is a partially compensating decrease of the gg luminosity.
The effect of the Tier-2 penalty is very small, being most noticeable for values of $\alpha_s$ around its global average of 0.118,
which results in a squeezing of the ellipses in that region.

Table III recapitulates the results from Figs. 34 and 35 by listing the central values of $\sigma_H$, the PDF uncertainties,
and combined PDF+$\alpha_s$ uncertainties as obtained by the Hessian and LM methods. Here, the PDF+$\alpha_s$ uncertainty
at 68% C.L. is obtained from the result at 90% C.L. by a scaling factor of 1/1.645.

The gg PDF luminosities for CT14, MMHT2014 [126] and NNPDF3.0 [88] PDFs at 13 TeV are shown in Fig. 36.
The parton luminosity is defined as in Ref. [127]. All central values and uncertainty bands agree very well among
the three global PDFs in the x range sensitive to Higgs production. In Table IV, we compare the predictions for
$\sigma_H$ from CT14 with those from MMHT2014, NNPDF3.0, and CT10. Compared to CT10, predicted $\sigma_H$ values for

FIG. 35: Contour plots of $\Delta\chi^2(\alpha_s(M_Z), \sigma_H)$ plus Tier-2 penalty in the $(\alpha_s(M_Z), \sigma_H)$ plane, for $\sigma_H$ at the LHC 8 and 13 TeV.


$gg \to H$ (pb), PDF unc., $\alpha_s$ = 0.118     8 TeV                  13 TeV
68% C.L. (Hessian)                       18.7 +2.1% -2.3%       42.7 +2.0% -2.4%
68% C.L. (LM)                                 +2.3% -2.3%            +2.4% -2.5%

$gg \to H$ (pb), PDF+$\alpha_s$ unc.              8 TeV                  13 TeV
68% C.L. (Hessian)                       18.7 +2.9% -3.0%       42.7 +3.0% -3.2%
68% C.L. (LM)                                 +3.0% -2.9%            +3.2% -3.1%

TABLE III: Uncertainties of $\sigma_H(gg \to H)$ computed by the Hessian and LM methods, with Tier-2 penalty included. The 68%
C.L. errors are given as percentages of the central values. The PDF-only uncertainties are for $\alpha_s(M_Z) = 0.118$.

CT14 NNLO have increased by 1-1.5%. Along with the changes also present in the updated PDFs from the two other
PDF groups, the modest increase in the CT14 gluon brings $\sigma_H$ from the global PDF groups into a remarkably good
agreement. The projected spread due to the latest NNLO PDFs in the total cross section $\sigma_H$ at 13 TeV will be about
the same in magnitude as the scale uncertainty in its NNNLO prediction.

Besides providing an estimate of the PDF uncertainty, the LM analysis allows us to identify the experimental data
sets that are most sensitive to variations of $\sigma_H$. In the LM scan of $\sigma_H$, we monitor the changes of the equivalent
Gaussian variable $S_n$ for each included experimental data set. In the plots of $S_n$ values vs. $\sigma_H$, of the type presented



            CT14          MMHT2014      NNPDF3.0          CT10
8 TeV       18.66 +…      …             18.77 +1.8% …     18.37 +…
13 TeV      42.68 +…      42…           42…               42.20 +…

TABLE IV: Higgs boson production cross sections (in pb) for the gluon fusion channel at the LHC, at 8 and 13 TeV
center-of-mass energies, obtained using the CT14, MMHT2014, NNPDF3.0, and CT10 PDFs, with a common value of $\alpha_s(M_Z) = 0.118$.
The errors given are due to the PDFs at the 68% C.L.

FIG. 36: The gg PDF luminosities for CT14, MMHT2014 and NNPDF3.0 PDFs at the LHC with $\sqrt{s}$ = 8 and 13 TeV, with
$\alpha_s(M_Z) = 0.118$.


in Fig. 37, we select the experiments whose $S_n$ (closely related to $\chi^2_n$) depends strongly on $\sigma_H$. Such experiments
typically impose the tightest constraints on $\sigma_H$, when their $S_n$ quickly grows with $\sigma_H$.
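
To make the role of $S_n$ concrete, the sketch below maps the $\chi^2_n$ of a data set with $N_n$ points onto an approximately standard-normal variable. It assumes Fisher's textbook approximation, $S_n = \sqrt{2\chi^2_n} - \sqrt{2N_n - 1}$; the exact definition used in the CT14 analysis may be a refined version of this, and the numbers below are hypothetical.

```python
import math


def equivalent_gaussian(chi2, npts):
    """Fisher's approximation to an equivalent Gaussian variable:
    S = sqrt(2 * chi2) - sqrt(2 * npts - 1).

    S near 0 indicates a typical fit; S of a few units indicates a poor
    fit; large negative S indicates an unexpectedly good fit.
    """
    return math.sqrt(2.0 * chi2) - math.sqrt(2.0 * npts - 1.0)


# Hypothetical data set with 133 points: follow S_n as chi^2_n grows
# during a scan away from the best fit.
npts = 133
for chi2 in (120.0, 133.0, 150.0, 180.0):
    print("chi2 = %6.1f  ->  S_n = %+.2f" % (chi2, equivalent_gaussian(chi2, npts)))
```

Because $S_n$ puts data sets with very different numbers of points on a common scale, a rapid rise of $S_n$ along the scan identifies the experiments that resist a given shift in $\sigma_H$ most strongly.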

We see that, although the CMS 7 TeV inclusive jet data (538) is relatively poorly fit by CT14 NNLO, it is also
not very sensitive to the expected Higgs cross section. The data sets most relevant to the Higgs cross section are the
HERA inclusive data set (159) at both larger and smaller values of $\sigma_H$, as well as combined charm production cross
sections from HERA (147); DØ Run 2 inclusive jet (514); and CCFR $F_2^p$ (110) at larger $\sigma_H$. At small $\sigma_H$, the most
sensitive data set is BCDMS (102), with some sensitivity also from E605 Drell-Yan (201) and LHCb 7 TeV charge
asymmetry (241). Sensitivity of $\sigma_H$ to CCFR dimuon data observed with CT10 is no longer present.

FIG. 37: The equivalent Gaussian variable $S_n$ versus $\sigma_H$ at the LHC with $\sqrt{s}$ = 8 and 13 TeV.


B. $t\bar{t}$ production cross section at the LHC

Next, we consider theoretical predictions and their uncertainties for the total inclusive cross section for $t\bar{t}$ production
at the LHC, and also present some differential cross sections.

In the $t\bar{t}$ case, the comparison between the Hessian and Lagrange multiplier methods for finding uncertainties is
very similar to that found for the Higgs cross section. Therefore, we just present our final estimates for the total
inclusive cross section from the Top++ code [100], given in Table V. Recent experimental measurements of the total
inclusive cross section for top-quark pair production at the LHC are given in Table VI, together with ATLAS and
CMS combined determinations at $\sqrt{s}$ = 7 and 8 TeV.


$pp \to t\bar{t}$ (pb), PDF unc., $\alpha_s$ = 0.118     7 TeV                8 TeV                13 TeV
68% C.L. (Hessian)                        177 +4.4% -3.7%      253 +3.9% -3.5%      823 +2.6% -2.7%
68% C.L. (LM)                             -                        +4.8% -4.6%          +2.9% -2.9%

$pp \to t\bar{t}$ (pb), PDF+$\alpha_s$                   7 TeV                8 TeV                13 TeV
68% C.L. (Hessian)                            +5.5% -4.6%          +5.2% -4.4%          +3.6% -3.5%
68% C.L. (LM)                             -                        +5.1% -4.7%          +3.6% -3.5%

TABLE V: CT14 NNLO total inclusive cross sections for top-quark pair production at LHC center-of-mass energies of 7, 8,
and 13 TeV, for an assumed top-quark mass of 173.3 GeV.


For comparison, predictions and PDF-only errors using CT10 NNLO PDFs give $\sigma_{t\bar{t}}$ = 246 pb (+…%, -…%) at 8 TeV and
$\sigma_{t\bar{t}}$ = 806 pb (+…%, -…%) at 13 TeV at 68% C.L. Here we find that the Hessian and the LM methods are in very good
agreement in CT14 at $\sqrt{s}$ = 13 TeV, and agree slightly worse at $\sqrt{s}$ = 8 TeV. Measurements of $t\bar{t}$ pair production can
potentially constrain the gluon PDF at large x, if correlations between the gluon, $\alpha_s(M_Z)$ and the top-quark mass
are accounted for. Given the current experimental precision of $t\bar{t}$ measurements, the impact of such data in a global
PDF fit is expected to be moderate; related exploratory studies can be found in Ref. [74].

In Figs. 38, 39 and 40, the normalized top-quark transverse momentum $p_T$ and rapidity y distributions at
approximate NNLO ($O(\alpha_s^4)$) are compared to the CMS and ATLAS [59] measurements, at a center-of-mass energy
$\sqrt{s}$ = 7 TeV. The yellow bands represent the CT14 PDF uncertainty evaluated at the 68% C.L. with the program
DiffTop, based on QCD threshold expansions beyond the leading logarithmic approximation, for one-particle
inclusive kinematics. The value of the top-quark mass here is $m_t$ = 173.3 GeV in the “pole mass” definition. In
Fig. 41, the correlation cosine between the differential top-quark $p_T$ distribution and the momentum fraction x carried
by the gluon is shown, in four different $p_T$ bins at the LHC with $\sqrt{s}$ = 8 and 13 TeV. The correlation cosine at $\sqrt{s}$ =
7 TeV exhibits identical features to that at $\sqrt{s}$ = 8 TeV and is therefore omitted. A strong correlation between the
$p_T$ distribution and the large-x gluon ($x \sim 0.1$) is observed for both LHC energies, although the cosines exhibit different
patterns of x dependence. Finally, in Fig. 42 we present the absolute, rather than normalized, differential $p_T$ and y
distributions for top-quark production, together with the relative PDF uncertainties, at the LHC with $\sqrt{s}$ = 7, 8 and
13 TeV.

$\sigma_{t\bar{t}}$ (pb)                         7 TeV (dilepton channel)                            8 TeV (lepton+jets)
ATLAS [128, 129]                 177 ± 20 (stat.) ± … (syst.) ± … (lumi.)            260 ± … (stat.) … (syst.) ± … (lumi.) ± … (beam)
CMS [130, 131]                   161.9 ± … (stat.) … (syst.) ± 3.6 (lumi.)           228.4 ± 9.0 (stat.) +29 … (syst.) ± … (lumi.)

                                 7 TeV (lepton+jets, dilepton, all-jets)             8 TeV (dilepton channel)
ATLAS and CMS combined […, 132]  173.3 ± 2.3 (stat.) ± 7.6 (syst.) ± 6.3 (lumi.)     241.5 ± … (stat.) ± 5.7 (syst.) ± 6.2 (lumi.)

TABLE VI: Measurements of total inclusive cross sections for top-quark pair production at LHC center-of-mass energies of 7,
8, and 13 TeV, for an assumed top-quark mass of 172.5 GeV.


VI. DISCUSSION AND CONCLUSION 


In this paper, we have presented CT14, the next generation of NNLO (as well as LO and NLO) parton distributions 
from a global analysis by the CTEQ-TEA group. With rapid improvements in LHC measurements, the focus of the 
global analysis has shifted toward providing accurate predictions in the wide range of x and Q covered by the LHC 
data. This development requires a long-term multi-prong effort in theoretical, experimental, and statistical areas. 

In the current study, we have added enhancements that open the door for long-term developments in CT14
methodology geared toward the goals of LHC physics. This is the first CT analysis that includes measurements of inclusive
production of vector bosons and jets [51] from the LHC at 7 and 8 TeV as input for the fits. We
also include new data on charm production from DIS at HERA and precise measurements of the electron charge
asymmetry from DØ at 9.7 fb$^{-1}$ [14]. These measurements allow us to probe new combinations of quark flavors that
were not resolved by the previous data sets. As most of these measurements contain substantial correlated systematic
uncertainties, we have implemented these correlated errors and have examined their impact on the PDFs.

On the theory side, we have introduced a more flexible parametrization to better capture variations in the PDF
dependence. A series of benchmark tests of NNLO cross sections, carried out in the run-up for the CT14 fit for all key
fitted processes, has resulted in better agreement with most experiments and brought accuracy of most predictions to
the truly NNLO level. We examined the PDF errors for the important LHC processes and have tested the consistency
of the Hessian and Lagrange Multiplier approaches. Compared to CT10, the new inputs and theoretical advancements
resulted in a softer d/u ratio at large x, a lower strangeness PDF at x > 0.01, a slight increase in the large-x gluon
(of order 1%), and wider uncertainty bands on $d/u$, $\bar{d}/\bar{u}$ and $q - \bar{q}$ combinations at x of order 0.001 (probed by LHC
W/Z production). Despite these changes in central predictions, the CT14 NNLO PDFs remain consistent with CT10
NNLO within the respective error bands.

FIG. 38: Normalized final-state top-quark $p_T$ differential distribution at CMS 7 TeV.

Some implications of CT14 predictions for phenomenological observables were reviewed in Sections IV and V.
Compared to calculations with CT10 NNLO, the $gg \to H$ total cross section has increased slightly in CT14: by 1.6%
at the LHC 8 TeV and by 1.1% at 13 TeV. The $t\bar{t}$ production cross sections have also increased in CT14 by 2.7% at 8
TeV and by 1.4% at 13 TeV. The W and Z cross sections, while still consistent with CT10, have slightly changed as
a result of reduced strangeness. Common ratios of strangeness and non-strangeness PDFs for CT14 NNLO, shown in
Eqs. (8) and (10), are consistent with the independent ATLAS, CMS, and NOMAD determinations within the PDF
uncertainties.

FIG. 39: Normalized final-state top-quark rapidity ($y_t$) distribution at CMS 7 TeV (panels: $m_t$ = 173.3 GeV, LHC 7 TeV, CT14NNLO, 68% C.L.). 

The final CT14 PDFs are presented in the form of 1 central and 56 Hessian eigenvector sets at NLO and NNLO. 
The 90% C.L. PDF uncertainties for physical observables can be estimated from these sets using the symmetric 
or asymmetric [5] master formulas by adding contributions from each pair of sets in quadrature. These PDFs 
are determined for the central QCD coupling of $\alpha_s(M_Z) = 0.118$, consistent with the world-average $\alpha_s$ value. For 
estimation of the combined PDF+$\alpha_s$ uncertainty, we provide two additional best-fit sets for $\alpha_s(M_Z) = 0.116$ and 
0.120. The 90% C.L. variation due to $\alpha_s(M_Z)$ can be estimated as one half of the difference in predictions from the two 
$\alpha_s$ sets. The PDF+$\alpha_s$ uncertainty, at 90% C.L. and including correlations, can also be determined by adding the 
PDF uncertainty and the $\alpha_s$ uncertainty in quadrature [125]. 
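The recipe in the paragraph above can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard forms of the symmetric and asymmetric Hessian master formulas; the function names and the toy numbers are hypothetical and are not part of any CT14 or LHAPDF distribution. The inputs are the predictions of an observable X for the central set and for the plus and minus member of each eigenvector pair.

```python
import math

def symmetric_error(plus, minus):
    """Symmetric master formula: half the quadrature sum of X_i(+) - X_i(-)."""
    return 0.5 * math.sqrt(sum((p - m) ** 2 for p, m in zip(plus, minus)))

def asymmetric_errors(central, plus, minus):
    """Asymmetric master formula: upward and downward excursions kept separately."""
    up = math.sqrt(sum(max(p - central, m - central, 0.0) ** 2
                       for p, m in zip(plus, minus)))
    down = math.sqrt(sum(max(central - p, central - m, 0.0) ** 2
                         for p, m in zip(plus, minus)))
    return up, down

def alphas_error(x_low, x_high):
    """Half the difference of predictions for alpha_s(M_Z) = 0.116 and 0.120."""
    return 0.5 * abs(x_high - x_low)

def combined_error(pdf_err, as_err):
    """PDF and alpha_s uncertainties added in quadrature."""
    return math.hypot(pdf_err, as_err)

# Toy numbers (hypothetical): two eigenvector pairs instead of the 28 pairs of CT14.
central = 10.0
plus = [10.3, 9.9]
minus = [9.8, 10.4]

pdf_sym = symmetric_error(plus, minus)
pdf_up, pdf_down = asymmetric_errors(central, plus, minus)
total = combined_error(pdf_sym, alphas_error(9.85, 10.15))
```

With real sets, `plus` and `minus` would each hold 28 values, one per pair among the 56 error sets; how those sets are ordered in a given grid file is an assumption that should be checked against the distribution being used.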


At leading order, we provide two PDF sets, obtained assuming 1-loop evolution of $\alpha_s$ and $\alpha_s(M_Z) = 0.130$; and 
2-loop evolution of $\alpha_s$ and $\alpha_s(M_Z) = 0.118$. Besides these general-purpose PDF sets, we provide a series of (N)NLO 
sets for $\alpha_s(M_Z)$ = 0.111-0.123 and additional sets in heavy-quark schemes with up to 3, 4, and 6 active flavors. 
Phenomenological applications of the CT14 series and the special CT14 PDFs (such as those allowing for a nonperturbative 
intrinsic charm contribution) will be discussed in a follow-up study [134]. 


Parametrizations for the CT14 PDF sets are distributed in a standalone form via the CTEQ-TEA website [136], or 
as a part of the LHAPDF6 library [7]. For backward compatibility with version 5.9.X of LHAPDF, our website also 
provides CT14 grids in the LHAPDF5 format, as well as an update for the CTEQ-TEA module of the LHAPDF5 
library, which must be included during compilation to support calls of all eigenvector sets included with CT14 [137]. 

FIG. 40: Normalized final-state top-quark $p_T$ differential distribution at ATLAS 7 TeV. 

FIG. 41: The correlation cosine as a function of x-gluon for the top-quark $p_T$ distribution in $t\bar t$ production at the LHC at $\sqrt{s}$ 
= 8 and 13 TeV (panels: LHC 8 TeV and LHC 13 TeV, $m_t$ = 173.3 GeV, CT14NNLO PDFs). 


FIG. 42: Absolute differential $p_T$ and $y$ distributions for the final-state top quark in $t\bar t$ production at the LHC at $\sqrt{s}$ = 7, 8, 
and 13 TeV (panels: DiffTop approx. NNLO, $m_t$ = 173.3 GeV, CT14 NNLO, 68% C.L.). 


Appendix: Parametrizations in CT14 

Parton distribution functions are measured by parametrizing their x-dependence at a low scale $Q_0$. For each choice 
of parameters, the PDFs are computed at higher scales by DGLAP evolution; and the parameters at $Q_0$ are adjusted 
to optimize the fit to a wide variety of experimental data. Traditional parametrizations for each flavor are of the form 

$$x\, f_a(x, Q_0) = x^{a_1} (1-x)^{a_2} P_a(x) \qquad (11)$$

where the $x^{a_1}$ behavior at $x \to 0$ is guided by Regge theory, and the $(1-x)^{a_2}$ behavior at $x \to 1$ is guided by spectator 
counting rules. The remaining factor $P_a(x)$ is assumed to be slowly varying, because there is no reason to expect fine 
structure in it even at scales below $Q_0$, and evolution from those scales up to $Q_0$ provides additional smoothing. 

In the previous CTEQ analyses, $P_a(x)$ in Eq. (11) for each flavor was chosen as an exponential of a polynomial in 
$x$ or $\sqrt{x}$; e.g., 

$$P(x) = \exp\left(a_0 + a_3 \sqrt{x} + a_4 x + a_5 x^2\right) \qquad (12)$$

for $u_v(x)$ or $d_v(x)$ in CT10. The exponential form conveniently enforces the desired positive-definite behavior for 
the PDFs, and it suppresses non-leading behavior in the limit $x \to 0$ by a factor $\sqrt{x}$, which is similar to what would 
be expected from a secondary Regge trajectory. However, this parametrization has two undesirable features. First, 
because the exponential function can vary rapidly, the power laws $x^{a_1}$ and $(1-x)^{a_2}$, which formally control the $x \to 0$ 
and $x \to 1$ limits, need not actually dominate in practical regions of small x (say x < 0.001) or large x (say x > 0.6). 
Second, the qualitative similarity of $\exp(a_3 \sqrt{x})$, $\exp(a_4 x)$, and $\exp(a_5 x^2)$ to each other causes the parameters $a_3$, 
$a_4$, $a_5$ to be strongly correlated with each other in the fit. This correlation may destabilize the minimization and 
compromise the Hessian approach to uncertainty analysis, since that approach is based on a quadratic dependence of 
$\chi^2$ on the fitting parameters, which is only guaranteed close to the minimum. 

We introduce a better style of parametrization in CT14. We begin by replacing $P_a(x)$ by a polynomial in $\sqrt{x}$, 
which avoids the rapid variations invited by an exponential form. Low-order polynomials have been used previously 
by many other groups; however, polynomials with higher powers were less widespread. We add them now to provide 
more flexibility in the parametrization. In particular, for the best-constrained flavor combination $u_v(x) = u(x) - \bar u(x)$ 
we use a fourth-order polynomial 

$$P_{u_v} = c_0 + c_1 y + c_2 y^2 + c_3 y^3 + c_4 y^4, \qquad (13)$$


where $y = \sqrt{x}$. But rather than using the coefficients $c_i$ directly as fitting parameters, we re-express the polynomial 
as a linear combination of Bernstein polynomials: 

$$P_{u_v} = d_0\, p_0(y) + d_1\, p_1(y) + d_2\, p_2(y) + d_3\, p_3(y) + d_4\, p_4(y), \qquad (14)$$

where 

$$\begin{aligned}
p_0(y) &= (1-y)^4, \\
p_1(y) &= 4y\,(1-y)^3, \\
p_2(y) &= 6y^2 (1-y)^2, \\
p_3(y) &= 4y^3 (1-y), \\
p_4(y) &= y^4.
\end{aligned} \qquad (15)$$


This re-expression does not change the functional form of $P_{u_v}$: it is still a completely general fourth-order polynomial 
in $y = \sqrt{x}$. But the new coefficients $d_i$ are less correlated with each other than the old $c_i$, because each Bernstein 
polynomial is strongly peaked at a different value of y. (The flexibility of the parametrization can be increased by 
using higher order polynomials; the generalization of Eq. (15) to higher orders is obvious: the numerical factors are 
just binomial coefficients.) 
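The equivalence of the two bases can be checked numerically. The sketch below is illustrative only: it assumes the standard conversion between power-basis and Bernstein coefficients, $d_k = \sum_{i \le k} \binom{k}{i} c_i / \binom{n}{i}$, and the coefficient values are hypothetical rather than fitted CT14 values. It also locates the peak of each basis polynomial on a grid in y.

```python
from math import comb

def bernstein(k, n, y):
    """Bernstein basis polynomial C(n,k) * y**k * (1-y)**(n-k) on 0 <= y <= 1."""
    return comb(n, k) * y ** k * (1.0 - y) ** (n - k)

def power_to_bernstein(c):
    """Map power-basis coefficients c_0..c_n to Bernstein coefficients d_0..d_n."""
    n = len(c) - 1
    return [sum(comb(k, i) / comb(n, i) * c[i] for i in range(k + 1))
            for k in range(n + 1)]

def eval_power(c, y):
    """Evaluate sum_i c_i * y**i, as in Eq. (13)."""
    return sum(ci * y ** i for i, ci in enumerate(c))

def eval_bernstein(d, y):
    """Evaluate sum_k d_k * p_k(y), as in Eq. (14)."""
    n = len(d) - 1
    return sum(dk * bernstein(k, n, y) for k, dk in enumerate(d))

c = [0.3, -1.2, 2.0, 0.7, -0.5]   # hypothetical c_i, for illustration only
d = power_to_bernstein(c)

# Each p_k peaks at a different point, y = k/4, which is why the d_k decouple.
grid = [j / 100 for j in range(101)]
peaks = [max(grid, key=lambda y, k=k: bernstein(k, 4, y)) for k in range(5)]
```

Both `eval_power(c, y)` and `eval_bernstein(d, y)` return the same polynomial; only the coefficients that a fit would vary differ.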

In practice, we refine this procedure as follows. First, as a matter of convenience, we set $d_4 = 1$ and supply in 
its place an overall constant factor, which is determined by the number sum rule $\int_0^1 u_v(x)\,dx = 2$. We then set 
$d_3 = 1 + a_1/2$ to suppress deviations from the $(1-x)^{a_2}$ behavior of $u_v(x)$ at large x by canceling the first subleading 
power of $(1-x)$ in $x\,u_v(x)$: 

$$x\, u_v(x) \to \mathrm{const} \times (1-x)^{a_2} \times \left[1 + O\big((1-x)^2\big)\right] \quad \text{for } x \to 1. \qquad (16)$$

We use the same parametrization for $d_v(x) = d(x) - \bar d(x)$, with the same parameter values $a_1$ and $a_2$; but, of 
course, independent parameters for the coefficients of the Bernstein polynomials and the normalization, which is set 
by $\int_0^1 d_v(x)\,dx = 1$. 
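
The condition $d_3 = 1 + a_1/2$ can be verified by a short expansion in $\epsilon = 1 - x$. The following sketch uses only Eqs. (11), (14), and (15) with $d_4 = 1$; the symbol $\epsilon$ is introduced here for convenience and does not appear elsewhere in the text.

```latex
\begin{align*}
y &= \sqrt{1-\epsilon} = 1 - \tfrac{\epsilon}{2} + O(\epsilon^2),
  \qquad 1 - y = \tfrac{\epsilon}{2} + O(\epsilon^2), \\
P_{u_v}(y) &= y^4 + 4 d_3\, y^3 (1-y) + O\big((1-y)^2\big)
  = 1 + 2\,(d_3 - 1)\,\epsilon + O(\epsilon^2), \\
x^{a_1} &= (1-\epsilon)^{a_1} = 1 - a_1\,\epsilon + O(\epsilon^2), \\
x\,u_v(x) &\propto \epsilon^{a_2}\,\big[\,1 + (2 d_3 - 2 - a_1)\,\epsilon + O(\epsilon^2)\,\big].
\end{align*}
```

The term linear in $\epsilon$ vanishes when $2 d_3 - 2 - a_1 = 0$, that is, for $d_3 = 1 + a_1/2$, which reproduces the form of Eq. (16).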

Tying the valence $a_1$ parameters together is motivated by Regge theory, and supported by the observation that 
the value of $a_1$ obtained in the fit is not far from the value expected from Regge theory. (The $a_1$ values for $u$, $d$, $\bar u$, 
and $\bar d$ are expected to be close to 0 from the Pomeron trajectory; but that leading behavior is expected to cancel in 
$u_v = u - \bar u$ and $d_v = d - \bar d$, revealing the subleading vector meson Regge trajectory at $a_1 \approx 0.5$.) Not counting 
the two normalization parameters that are constrained by quark-number sum rules, we are left with a total of 8 
fitting parameters for the valence quarks. This is the same number of parameters as were used in CT10 NNLO. As 
a consistency check, we find that allowing the $a_1$ and $a_2$ parameters for $d_v$ to be independent of those for $u_v$ would 
reduce $\chi^2 \approx 3380$ by less than one unit. Allowing those parameters to be free would also not substantially increase 
the uncertainty range given by the Hessian procedure, except at very large x, where the fractional uncertainty is 
already very large. The additional fractional uncertainty at small x generated by allowing different $a_1$ powers is also 
not important, because that uncertainty only appears in the valence quantities $u(x) - \bar u(x)$ and $d(x) - \bar d(x)$; while 
most processes of interest are governed by the much larger $u(x)$, $d(x)$, $\bar u(x)$, $\bar d(x)$ themselves. 

In addition to theoretical arguments that the power laws $a_2$ should be the same for $u_v$ and $d_v$ [135], $\chi^2$ tends to be 
insensitive to the differences. A large portion of the data included in the global fit are from electron and muon DIS 
on protons, which is more sensitive to $u$ and $\bar u$ than to $d$ and $\bar d$ because of the squares of their electric charges. Hence, 
when similar parametrizations are used for $P_{u_v}$ and $P_{d_v}$, the uncertainties of $a_1(d_v)$ and $a_2(d_v)$ are relatively large. 

Our assumption $a_2(u_v) = a_2(d_v)$ forces $u_v(x)/d_v(x)$ to approach a constant in the limit $x \to 1$. It allows our 
phenomenological findings to be relevant for the extensive discussions of what that constant might be. However, 
the experimental constraints at large x are fairly weak: we can find excellent fits over the range $-0.5 < a_2(d_v) - 
a_2(u_v) < 1.2$ at an increase of only 5 units in $\chi^2$. Hence both $u_v(x)/d_v(x) \to 0$ and $u_v(x)/d_v(x) \to \infty$ at $x \to 1$ 
remain fully consistent with the data. However, our assumption $a_2(u_v) = a_2(d_v)$ does not restrict the calculated 
uncertainty range materially in regions where it is not already very large. 


By way of comparison, if we use the CT10 NNLO [6] form (12) for $u_v$ and $d_v$, we obtain a slightly better fit ($\chi^2$ 
lower by 8) with an unreasonable $a_2 \approx 0.1$. Similar behavior led us to fix $a_2 = 0.2$ in CT10 NNLO. 

In a different comparison, the MSTW2008 fit [138] uses a parametrization for $u_v$ and $d_v$ that is equivalent to 
Eq. (13) with $c_3 = c_4 = 0$, with the power-law parameters $a_1$ and $a_2$ allowed to differ between $u_v$ and $d_v$. If we use 
this MSTW parametrization for the valence quarks at our $Q_0 = 1.3$ GeV, in place of the form we have chosen, the 
best-fit $\chi^2$ increases by 64, even though the total number of fitting parameters is the same. This decline in the fit 
quality comes about because the freedom to have $a_1(u_v) \neq a_1(d_v)$ and $a_2(u_v) \neq a_2(d_v)$ is not actually very helpful, 
as noted above; so setting $c_3 = c_4 = 0$ does not leave an adequate number of free parameters. 

The more recent MMHT2014 [126] PDF fit uses full fourth-order polynomials for $u_v$ and $d_v$. In our fit, however, 
we find that no significant improvement in $\chi^2$ would result from treating $d_3(u_v)$ and $d_3(d_v)$ as free parameters, rather 
than choosing them to cancel the first subleading behavior at $x \to 1$, as we have done. 


Meanwhile the HERA PDF fits [34] use much more restricted forms, equivalent to $c_1 = c_2 = c_3 = 0$ for 
$u_v$ and $c_1 = c_2 = c_3 = c_4 = 0$ for $d_v$. Those forms are far too simple to describe our data set: using them in place of 
our choice increases $\chi^2$ by more than 200. 

We made a case in previous work [41] to repackage polynomial parametrizations like (13) as linear combinations 
of Chebyshev polynomials of argument $1 - 2\sqrt{x}$. This method has been adopted in the recent MMHT2014 fit [126]. 
However, we now contend that repackaging based on a linear combination of Bernstein polynomials, as we do in CT14, 
is much better. The full functional forms available in the fit are, of course, the same either way. But, because each of 
the Bernstein polynomials has a single peak, and the peaks occur at different values of x, the coefficients that multiply 
those polynomials mainly control distinct physical regions, and are therefore somewhat independent of each other. 

In contrast, every Chebyshev polynomial of argument $1 - 2\sqrt{x}$ has a maximum value ±1 at both x = 0 and x = 1, 
along with an equal maximum magnitude at some interior points. All Chebyshev polynomials are important over the 
entire range of x, so their coefficients are strongly correlated in the fit. This causes minor difficulties in finding the 
best fit and major difficulties in using the Hessian method to estimate uncertainties based on orthogonal eigenvectors. 
Furthermore, using Bernstein polynomials makes it easy to enforce the desired positivity of the PDFs in the $x \to 0$ 
and $x \to 1$ limits, because each of those limits is controlled by a single polynomial. 
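
The endpoint behavior that distinguishes the two bases can be tabulated directly. The snippet below is a small illustrative check, not part of the CT14 fitting code: it evaluates the Chebyshev polynomials $T_k(1 - 2\sqrt{x})$ and the fourth-order Bernstein polynomials in $y = \sqrt{x}$ at x = 0 and x = 1. The helper names are hypothetical.

```python
from math import comb, sqrt

def chebyshev(n, t):
    """Chebyshev polynomial of the first kind T_n(t), via the three-term recurrence."""
    t_prev, t_curr = 1.0, t
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        t_prev, t_curr = t_curr, 2.0 * t * t_curr - t_prev
    return t_curr

def bernstein(k, n, y):
    """Bernstein basis polynomial C(n,k) * y**k * (1-y)**(n-k)."""
    return comb(n, k) * y ** k * (1.0 - y) ** (n - k)

def endpoint_table(n=4):
    """For each index k, return (k, Chebyshev values, Bernstein values) at x = 0 and x = 1."""
    table = []
    for k in range(n + 1):
        cheb = tuple(chebyshev(k, 1.0 - 2.0 * sqrt(x)) for x in (0.0, 1.0))
        bern = tuple(bernstein(k, n, sqrt(x)) for x in (0.0, 1.0))
        table.append((k, cheb, bern))
    return table

table = endpoint_table()
```

Every Chebyshev entry has magnitude 1 at both endpoints, whereas only $p_0$ is nonzero at x = 0 and only $p_4$ is nonzero at x = 1, so the sign of the polynomial in each limit is fixed by a single Bernstein coefficient.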

We use a similar parametrization for the gluon, but with a polynomial of a lower order, because the data provide 
fewer constraints on the gluon distribution: 

$$P_g(y) = g_0 \left[ e_0\, q_0(y) + e_1\, q_1(y) + q_2(y) \right] \qquad (17)$$

where 

$$\begin{aligned}
q_0(y) &= (1-y)^2, \\
q_1(y) &= 2y\,(1-y), \\
q_2(y) &= y^2.
\end{aligned} \qquad (18)$$

However, in place of $y = \sqrt{x}$, we use the mapping 

$$y = 1 - (1 - \sqrt{x})^2 = 2\sqrt{x} - x. \qquad (19)$$

This mapping makes $y = 1 - (1-x)^2/4 + O((1-x)^3)$ and hence 

$$P_g(y) \to \mathrm{const} + O\big((1-x)^2\big) \qquad (20)$$

in the limit $x \to 1$. This is an alternative way to suppress the first subleading power of $(1-x)$ at $x \to 1$. We have 5 
free parameters to describe the gluon distribution, including $g_0$, which governs the fraction of momentum carried by 
the gluons. The best fit has $a_2 = 3.8$, with the range $2.6 < a_2 < 5.0$ allowed by an increase of only 5 in $\chi^2$. 
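
The large-x behavior claimed for the mapping of Eq. (19) is easy to confirm numerically. The following sketch is an illustration only; the coefficient values `e0` and `e1` are hypothetical and are not the fitted CT14 values. It checks that $1 - y$ scales like $(1-x)^2/4$ and that $P_g$ departs from its $x = 1$ value only at second order in $(1-x)$.

```python
from math import sqrt

def gluon_map(x):
    """Mapping of Eq. (19): y = 1 - (1 - sqrt(x))**2 = 2*sqrt(x) - x."""
    return 2.0 * sqrt(x) - x

def gluon_poly(x, e0, e1, g0=1.0):
    """P_g of Eqs. (17)-(18), evaluated at y = gluon_map(x)."""
    y = gluon_map(x)
    q0, q1, q2 = (1.0 - y) ** 2, 2.0 * y * (1.0 - y), y ** 2
    return g0 * (e0 * q0 + e1 * q1 + q2)

def scaled_deviation(eps, e0=2.0, e1=3.0):
    """(P_g(1 - eps) - P_g(1)) / eps**2; stays finite only if no linear term is present."""
    return (gluon_poly(1.0 - eps, e0, e1) - gluon_poly(1.0, e0, e1)) / eps ** 2

steps = (1e-1, 1e-2, 1e-3)
ratios = [(1.0 - gluon_map(1.0 - eps)) / eps ** 2 for eps in steps]   # tends to 1/4
deviations = [scaled_deviation(eps) for eps in steps]                 # tends to (e1 - 1)/2
```

If the mapping $y = \sqrt{x}$ were used instead, the same scaled deviation would grow like $1/\epsilon$ as $\epsilon \to 0$, signaling a surviving linear term.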

In contrast, CT10 NNLO [6] again used the form (12) for the gluon distribution, where $a_2$ was frozen at an arbitrary 
value of 10 because $\chi^2$ was rather insensitive to it. That left the same number of free parameters as are used here, 
but did not allow anything to be learned about the behavior at very large x. 

If we use (12) for the gluon in our present fit, the resulting $\chi^2$ is nearly as good, but again this choice yields almost 
no information about the sixth parameter $a_2$: a range of $\Delta\chi^2 = 1$ includes $-0.4 < a_2 < 12$. The negative $a_2$ part of 
that range corresponds to an integrably singular gluon probability density at $x \to 1$, which is not actually forbidden 
theoretically, but would be totally unexpected. This older parametrization would bring in unmotivated complexity in 
the large-x region that is not indicated by any present data. To test that our parametrization has adequate flexibility, 
we made similar fits using somewhat higher order Bernstein polynomials, including up to a total of 10 more free 
parameters. We calculated the uncertainty for the $gg \to H$ cross section at 8 TeV using the Lagrange Multiplier 
method, and found very little variation in the range of the prediction. We also calculated the range of uncertainty in 
$\alpha_s(M_Z)$ obtained from our fits at 90% confidence (including our Tier 2 penalty). The extra freedom in parametrization 
increased the uncertainty range only slightly: 0.111-0.121 using the CT14 parametrization; 0.111-0.123 using the 
more flexible one. 

The sea quark distributions $\bar d$ and $\bar u$ were parametrized using fourth-order polynomials in y with the same mapping 
$y = 2\sqrt{x} - x$ that was used for the gluon. We assumed $\bar u(x)/\bar d(x) \to 1$ at $x \to 0$, which implies $a_1(\bar u) = a_1(\bar d)$. As 
the strangeness content is constrained rather poorly, we used a minimal parametrization $P_{s+\bar s} = \mathrm{const}$, with $a_1$ tied 
to the common $a_1$ of $\bar u$ and $\bar d$. Even fewer experimental constraints apply to the strangeness asymmetry, so we have 
assumed $s(x) = \bar s(x)$ in this analysis. Thus, we have just two parameters for strangeness in our Hessian method: $a_2$ 
and normalization. In view of more upcoming data on measuring the asymmetry in the production cross sections of 
$W + c$ and $W + \bar c$ from the LHC, we plan to include $s(x) \neq \bar s(x)$ in our next round of fits. 

In all, we have 8 parameters associated with the valence quarks, 5 parameters associated with the gluon, and 
13 parameters associated with sea quarks, for a total of 26 fitting parameters. Hence there are 52 eigenvector sets 
generated by the Hessian method that captures most of the PDF uncertainty. 

The Hessian method tends to underestimate the uncertainty for PDF variations that are poorly constrained, because 
the method is based on the assumption that $\chi^2$ is a quadratic function of the fitting parameters; and that assumption 
tends to break down when the parameters can move a long way because of a lack of experimental constraints. This 
can be seen, for example, for the case of the small-x gluon uncertainty, by a Lagrange Multiplier scan in which a series 
of fits are made with different values of the independent variable $g(x, Q)$ at x = 0.001, $Q = Q_0$. 

In order to include the wide variation of the gluon distribution that is allowed at small x, we therefore supplement 
the Hessian sets with an additional pair of sets that were obtained using the Lagrange Multiplier method: one with 
enhanced gluon and one with suppressed gluon at small x, as was already done in CT10. In CT14, we also include an 
additional pair of sets with enhanced or suppressed strangeness at small x; although it is possible that treating $a_1(s)$ 
as a fitting parameter independent from $a_1(\bar u) = a_1(\bar d)$ would have worked equally well. 

In summary, we have a total of 56 error sets: 2 × 26 from the Hessian method, supplemented by two extremes of 
small-x gluon, and two extremes of small-x strangeness. Uncertainties from all pairs of error sets are to be summed 
in quadrature using the master formulas [5, 21]. In comparison, CT10 NNLO had 50 error sets. The increased 
flexibility in the CT14 parametrization is warranted by better experimental constraints and its improved fit to the 
data. Indeed, fitting the CT14 data set using the old CT10 parametrizations yields a best fit that is worse by 60 units 
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